Abstract

We consider the Bayesian problem of estimating the probability of success in a series of conditionally independent trials with binary outcomes. We study the asymptotic behaviour of the differential entropy of the posterior probability density function conditional on $x$ successes after $n$ conditionally independent trials, as $n \to \infty$. It is shown that, after an appropriate normalization, in the cases $x \sim \alpha n$ ($0 < \alpha < 1$) and $x \sim n^{\beta}$ ($0 < \beta < 1$) the limiting distribution is Gaussian and the differential entropy of the standardized RV converges to the differential entropy of the standard Gaussian random variable. When $x$ or $n - x$ is a constant the limiting distribution is not Gaussian, but the asymptotics of the differential entropy can still be found explicitly.

Next, suppose that one wants to know whether the coin is fair and, for large $n$, is interested in the true frequency. To this end we use the concept of weighted differential entropy introduced in [1], which applies when a particular frequency $\gamma$ needs to be emphasized. We find that a weight of the suggested form does not change the asymptotic form of the Shannon, Renyi, Tsallis and Fisher entropies, but changes the constants. The main term in the weighted Fisher information changes by a constant which depends on the distance between the true frequency and the value we want to emphasize.

In the third part we derive weighted versions of the Rao-Cramer, Bhattacharyya and Kullback inequalities. These results are applied to the Bayesian problem described above. The asymptotic forms of these inequalities are obtained for a particular class of weight functions.

1 Introduction


Let $U$ be a random variable (RV) uniformly distributed on the interval $[0,1]$. Given a realization $p$ of this RV, consider a sequence of conditionally independent identically distributed RVs $\xi_i$, where $\xi_i = 1$ with probability $p$ and $\xi_i = 0$ with probability $1 - p$. Let $x_i$, each 0 or 1, be the outcome of trial $i$. Denote $S_n = \xi_1 + \ldots + \xi_n$, $\mathbf{x} = (x_i,\ i = 1, \ldots, n)$ and $x = x(n) = \sum_{i=1}^{n} x_i$.

Note that the RVs $(\xi_i)$ are positively correlated. Indeed, $P(\xi_i = 1, \xi_j = 1) = \int_0^1 p^2\,dp = 1/3$ if $i \neq j$, but $P(\xi_i = 1)P(\xi_j = 1) = \left(\int_0^1 p\,dp\right)^2 = 1/4$.

The probability that after $n$ trials the exact sequence $\mathbf{x}$ appears is

\[ P(\xi_1 = x_1, \ldots, \xi_n = x_n) = \int_0^1 p^{x}(1-p)^{n-x}\,dp = \frac{1}{(n+1)\binom{n}{x}}. \tag{1.1} \]

This implies that the distribution of the number of successes $S_n$ after $n$ trials is uniform:

\[ P(S_n = x) = \frac{1}{n+1}, \quad x = 0, \ldots, n. \]

The posterior PDF given the information that after $n$ trials one observes $x$ successes takes the form

\[ f_{p \mid S_n}(p \mid \xi_1 = x_1, \ldots, \xi_n = x_n) = (n+1)\binom{n}{x} p^{x}(1-p)^{n-x}. \tag{1.2} \]

Note that the conditional distribution given in (1.2) is a Beta-distribution $B(x+1, n-x+1)$. “It is known that Beta-distribution is asymptotically normal with its mean and variance as x and (n - x) tend to infinity, but this fact is lacking a handy reference” (see [3, p. 1]). That is why we give the proof of this fact in two cases.

The RV $Z^{(n)}$ with PDF (1.2) has the following expectation:

\[ E[Z^{(n)} \mid S_n = x] = \frac{x+1}{n+2}, \tag{1.3} \]

and the following variance:

\[ V[Z^{(n)} \mid S_n = x] = \frac{(x+1)(n-x+1)}{(n+3)(n+2)^2}. \tag{1.4} \]
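The moments (1.3) and (1.4) can be checked with exact rational arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes the standard Beta raw-moment formula $E[Z^k] = \prod_{j=0}^{k-1} \frac{x+1+j}{n+2+j}$ for the posterior $B(x+1, n-x+1)$.

```python
from fractions import Fraction


def beta_posterior_moment(x, n, k):
    """k-th raw moment of the Beta(x+1, n-x+1) posterior as an exact fraction."""
    moment = Fraction(1)
    for j in range(k):
        moment *= Fraction(x + 1 + j, n + 2 + j)
    return moment


def posterior_mean_var(x, n):
    """Return (mean, variance) of the posterior after x successes in n trials."""
    m1 = beta_posterior_moment(x, n, 1)
    m2 = beta_posterior_moment(x, n, 2)
    return m1, m2 - m1 * m1


# Compare with the closed forms (1.3) and (1.4) for x = 3, n = 10.
mean, var = posterior_mean_var(3, 10)
closed_mean = Fraction(3 + 1, 10 + 2)
closed_var = Fraction((3 + 1) * (10 - 3 + 1), (10 + 3) * (10 + 2) ** 2)
```

Because the arithmetic is exact, the comparison with the closed forms is an equality rather than a floating-point approximation.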
Recall that $h_d(f)$ is the differential entropy of a RV $Z$ with PDF $f$:

\[ h_d(f) = -\int_{\mathbb{R}} f(z)\log(f(z))\,dz \tag{1.5} \]

with the convention $0 \log 0 = 0$. Note that under a linear transformation of the RV $Z$ to a RV $X$ with PDF $g(x)$, where $X = d_1 Z + d_2$, the differential entropy transforms in the following way:

\[ h_d(g) = h_d(f) + \log d_1. \tag{1.6} \]

Let $Z$ be a standard Gaussian RV with PDF $\varphi$; then the differential entropy of $Z$ is [13]:

\[ h_d(\varphi) = \frac{1}{2}\log(2\pi e). \]

The goal of the first part of the work is to study the asymptotic behaviour of the differential entropy of the following RVs:

1. $Z_{\alpha}^{(n)}$ with PDF $f_{\alpha}^{(n)}$ given in (1.2) when $x = x(n) \sim \alpha n$, where $0 < \alpha < 1$;

2. $Z_{\beta}^{(n)}$ with PDF $f_{\beta}^{(n)}$ given in (1.2) when $x = x(n) \sim n^{\beta}$, where $0 < \beta < 1$;

3. $Z_{c_1}^{(n)}$ with PDF $f_{c_1}^{(n)}$ given in (1.2) when $x = c_1$, and $Z_{n-c_2}^{(n)}$ with PDF $f_{n-c_2}^{(n)}$ given in (1.2) when $n - x(n) = c_2$, where $c_1$ and $c_2$ are some constants.

We will demonstrate that the limiting distributions of the standardized RVs as $n \to \infty$ in cases 1 and 2 are Gaussian. However, asymptotic normality does not automatically imply the limiting form of the differential entropy. In general, the problem of taking limits under the sign of entropy is rather delicate and has been extensively studied in the literature, cf., e.g., [6, 12]. In the third case the limiting distribution is not Gaussian, but the asymptotics of the differential entropy can still be found explicitly.

In the second part of the paper (Section 3) we suppose that one wants to know whether the coin is fair and, for large $n$, is interested in the true frequency. So the goal of the statistical experiment is twofold: at the initial stage the experimenter is mainly concerned with whether the coin is fair (i.e. $p = 1/2$) or not. As the size of the sample grows, he proceeds to estimating the true value of the parameter anyway. We want to quantify the differential entropy of this experiment taking into account its two-sided objective. It seems that a quantitative measure of the information gain of this experiment is provided by the concept of weighted differential entropy. In our case $\phi(x)$ is a weight function that underlines the importance of 0.5.

The goal of the second part of the work is to study the weighted Shannon (1.15), Renyi (1.8), Tsallis (1.9) and Fisher (1.16) entropies [5]:

\[ h^{\phi}(f) = -\int_{\mathbb{R}} \phi^{(n)}(p) f(p)\log f(p)\,dp, \tag{1.7} \]

\[ H_{\nu}^{\phi}(f) = \frac{1}{1-\nu}\log\int_{\mathbb{R}} \phi^{(n)}(p)\,(f(p))^{\nu}\,dp, \tag{1.8} \]

\[ S_{q}^{\phi}(f) = \frac{1}{q-1}\left(1 - \int_{\mathbb{R}} \phi^{(n)}(p)\,(f(p))^{q}\,dp\right), \tag{1.9} \]

\[ I^{\phi}(\theta) = E\left[\phi^{(n)}(Z)\left(\frac{\partial}{\partial\theta}\log f(Z;\theta)\right)^{2}\right], \tag{1.10} \]

where $Z = Z^{(n)}$ is a RV with PDF $f$ given in (1.2) and $\phi^{(n)}(p)$ is a weight function that underlines the importance of some particular value. The following special cases are considered:

1. $\phi^{(n)}(p) \equiv 1$;

2. $\phi^{(n)}(p)$ depends both on $n$ and $p$.


We will denote by $\gamma$ the frequency that we want to emphasize (the 0.5 in the example above). We assume that $\phi(x) \geq 0$ for all $x$. In choosing the weight function we adopt the following normalization rule:

\[ \int_{\mathbb{R}} \phi^{(n)}(p) f^{(n)}(p)\,dp = 1. \tag{1.11} \]

It can be easily checked that if the weight function $\phi^{(n)}(p)$ satisfies (3.33), then the Renyi weighted entropy (1.8) and the Tsallis weighted entropy (1.9) tend to Shannon's weighted entropy as $\nu \to 1$ and $q \to 1$, respectively.

Since the goal of including a weight function is to emphasize some particular value, we consider the following weight function:

\[ \phi^{(n)}(p) = \Lambda^{(n)}(\gamma)\, p^{\gamma\sqrt{n}}(1-p)^{(1-\gamma)\sqrt{n}}, \tag{1.12} \]

where $\Lambda^{(n)}(\gamma)$ is found from the normalizing condition (3.33) and is given explicitly in (3.1). This weight function is selected as a model example with a twofold goal: to emphasize a particular value $\gamma$ for moderate $n$, while preserving the true frequency $p^{*}$.
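As a numerical sanity check of the normalization, the sketch below evaluates the constant from (3.1) through log-Gamma functions and integrates $\phi^{(n)}(p) f^{(n)}(p)$ over $[0,1]$ with a plain midpoint rule. The function names, the step count and the test values $n = 100$, $x = 30$, $\gamma = 0.5$ are assumptions made for illustration only.

```python
import math


def log_weight_constant(n, x, gamma):
    """Logarithm of the normalizing constant of the weight, as in (3.1)."""
    s = math.sqrt(n)
    z = gamma * s
    return (math.lgamma(x + 1) + math.lgamma(n - x + 1) + math.lgamma(n + 2 + s)
            - math.lgamma(x + z + 1) - math.lgamma(n - x + 1 + s - z)
            - math.lgamma(n + 2))


def weighted_mass(n, x, gamma, steps=20000):
    """Midpoint-rule value of the integral of phi(p) * f(p) over [0, 1]."""
    s = math.sqrt(n)
    z = gamma * s
    # log of (n+1) * C(n, x), the normalizing constant of the posterior (1.2)
    log_c = math.lgamma(n + 2) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    log_lam = log_weight_constant(n, x, gamma)
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        # Work in log-space to avoid overflow of the large constants.
        log_integrand = (log_lam + log_c
                         + (x + z) * math.log(p)
                         + (n - x + s - z) * math.log(1.0 - p))
        total += math.exp(log_integrand)
    return total * h


mass = weighted_mass(100, 30, 0.5)
```

If the constant is correct, `mass` should be close to 1, which is the normalization rule (1.11).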

In the third part of the paper (Sections 4, 5 and 6) we recall the statistical experiment with binary outcomes where the main objective is to find out whether the probabilities of success and failure are equal. In other words, the statistical decisions in a neighbourhood of the particular value $\gamma = 1/2$ are especially sensitive. It is clear that if an experimenter wrongly declares that the parameter of interest is in a small neighbourhood of the particular value $\gamma = 1/2$, then the penalty for this error should be more severe than for a similar error far from the sensitive area. Similar models of sensitive estimation appear in many fields of statistics. For this reason we start with the general framework and then specialize it to the case of binary trials as an example.

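Consider a RV $Z \in \mathbb{R}^{d}$ with PDF $f(z)$, or a family of RVs $Z_{\theta} \in \mathbb{R}^{d}$ with PDF $f_{\theta}$, where $\theta \in \Theta \subset \mathbb{R}^{m}$ is the vector of parameters of the PDF $f_{\theta}$. Denote $z = [z_1, \ldots, z_d]^{T}$. Let $\phi(\cdot)$ be the positive weight function that emphasizes the particular value $\gamma$, let $E^{\phi}_{\theta}(Z)$ be the weighted expectation of the random vector $Z$ with PDF $f_{\theta}$,

\[ g(\theta) = E^{\phi}_{\theta}(Z) = \int_{\mathbb{R}^{d}} z f_{\theta}(z)\phi(z)\,dz, \tag{1.13} \]

and let $E_{\theta}(Z)$ be the classic expectation of the random vector $Z$ with PDF $f_{\theta}$,

\[ e(\theta) = E_{\theta}(Z) = \int_{\mathbb{R}^{d}} z f_{\theta}(z)\,dz. \tag{1.14} \]

Quantitative measures of the information gain of experiments of the type described above are provided by the weighted Shannon differential entropy

\[ h^{\phi}(f_{\theta}) = -\int_{\mathbb{R}^{d}} \phi(z) f_{\theta}(z)\log f_{\theta}(z)\,dz, \tag{1.15} \]

the weighted ($m \times m$) Fisher information matrix

\[ I^{\phi}(\theta) = E_{\theta}\left[\phi(Z)\left(\frac{\partial}{\partial\theta}\log f_{\theta}(Z)\right)\left(\frac{\partial}{\partial\theta}\log f_{\theta}(Z)\right)^{T}\right], \tag{1.16} \]

where $\frac{\partial}{\partial\theta}$ is the notation for the gradient (the vector $\frac{\partial}{\partial\theta}\log f_{\theta}(Z)$ is the score), and the weighted Kullback-Leibler divergence of $g$ from $f$ [11]

\[ D^{\phi}(f \| g) = \int_{\mathbb{R}^{d}} \phi(z) f(z)\log\frac{f(z)}{g(z)}\,dz. \tag{1.17} \]

For simplicity we assume that the inverse Fisher matrix exists; in the general case, by the inverse we understand the Moore-Penrose pseudoinverse. It is also shown that in this context it is more convenient to study the calibrated Kullback-Leibler divergence defined in [11]:

\[ K^{\phi}(f \| g) = \int_{\mathbb{R}^{d}} \frac{\phi(z) f(z)}{C(f)}\log\frac{\phi(z) f(z)}{g(z)\,C(f)}\,dz = D(\tilde f \,\|\, g), \tag{1.18} \]

where $C(f) = \int_{\mathbb{R}^{d}} \phi(z) f(z)\,dz$, $\tilde f = \phi(z) f(z)\,C(f)^{-1}$ and $D(f \| g)$ is the standard Kullback-Leibler divergence of $g$ from $f$:

\[ D(f \| g) = \int_{\mathbb{R}^{d}} f(z)\log\frac{f(z)}{g(z)}\,dz. \tag{1.19} \]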

The goal of the third part is twofold. Firstly, the weighted analogues of the Rao-Cramer, Bhattacharyya and Kullback inequalities will be derived in the general case. Secondly, these inequalities will be illustrated by the example described above, which is of independent interest.




2 Asymptotic of Shannon’s differential entropy 

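Theorem 1. Let $\tilde Z_{\alpha}^{(n)} = n^{1/2}(\alpha(1-\alpha))^{-1/2}(Z_{\alpha}^{(n)} - \alpha)$ be a RV with PDF $\tilde f_{\alpha}^{(n)}$. Let $Z \sim \mathcal{N}(0,1)$ be the standard Gaussian RV with PDF $\varphi$. Then

(a) $\tilde Z_{\alpha}^{(n)}$ weakly converges to $Z$:

\[ \tilde Z_{\alpha}^{(n)} \Rightarrow Z \quad \text{as } n \to \infty. \]

(b) The differential entropy of $\tilde Z_{\alpha}^{(n)}$ converges to the differential entropy of $Z$:

\[ \lim_{n\to\infty} h(\tilde f_{\alpha}^{(n)}) = \frac{1}{2}\log(2\pi e). \]

(c) The Kullback-Leibler divergence of $\varphi$ from $\tilde f_{\alpha}^{(n)}$ tends to 0 as $n \to \infty$:

\[ \lim_{n\to\infty} D(\tilde f_{\alpha}^{(n)} \,\|\, \varphi) = 0. \]

Proof. (a) Let $x = x(n) = \alpha n$ where $0 < \alpha < 1$ and consider the RV

\[ \tilde Z_{\alpha}^{(n)} = n^{1/2}(\alpha(1-\alpha))^{-1/2}(Z_{\alpha}^{(n)} - \alpha). \]

We proceed by the method of characteristic functions and establish that

\[ \phi(t) = E\left[e^{it\tilde Z_{\alpha}^{(n)}}\right] \to e^{-t^{2}/2} \tag{2.1} \]

for all $t \in \mathbb{R}$. Indeed,

\[ \phi(t) = \int_0^1 e^{it\frac{(p-\alpha)\sqrt{n}}{\sqrt{\alpha(1-\alpha)}}} f_{\alpha}^{(n)}(p)\,dp = (n+1)\binom{n}{x} e^{-it\frac{\alpha\sqrt{n}}{\sqrt{\alpha(1-\alpha)}}}\int_0^1 e^{it\frac{p\sqrt{n}}{\sqrt{\alpha(1-\alpha)}}} p^{x}(1-p)^{n-x}\,dp, \]

and consider the integral

\[ J(t,\alpha,n) = \int_0^1 e^{n\left(\frac{itp}{\sqrt{\alpha(1-\alpha)n}} + \alpha\log p + (1-\alpha)\log(1-p)\right)}\,dp. \tag{2.2} \]

Denote $g(p) = \frac{itp}{\sqrt{\alpha(1-\alpha)n}} + \alpha\log p + (1-\alpha)\log(1-p)$. The integrand in (2.2) has a narrow sharp peak, and the integral is completely dominated by the maximum of $\mathrm{Re}[g(p)]$ as $n \to \infty$. For fixed values of $t$ and $\alpha$, and $n \to \infty$, it can be studied by the saddle point method [Theorem 1.3, p. 170]:

\[ J(t,\alpha,n) \sim e^{ng(p^{*})}\sqrt{\frac{2\pi}{-ng''(p^{*})}}\left(1 + O\left(\frac{1}{n}\right)\right). \tag{2.3} \]

Find the point of maximum of $\mathrm{Re}[g(p)]$ and deform the initial contour $[0,1]$ into the steepest descent contour through the saddle point:

\[ p^{*} = \alpha + it\sqrt{\frac{\alpha(1-\alpha)}{n}} + O\left(\frac{1}{n}\right). \]

So, $\phi(t)$ takes the form:

\[ \phi(t) = e^{-t^{2}}(n+1)\binom{n}{x}(p^{*})^{x}(1-p^{*})^{n-x}\sqrt{\frac{2\pi\alpha(1-\alpha)}{n}}\,(1 + o(1)). \]

Here and below $x = \lfloor \alpha n \rfloor$. Next, by Stirling's formula:

\[ \binom{n}{x} = \frac{n^{n}}{x^{x}(n-x)^{n-x}}\sqrt{\frac{n}{2\pi x(n-x)}}\left(1 + O\left(\frac{1}{n}\right)\right). \]

So, the straightforward computation yields:

\[ (p^{*})^{x}(1-p^{*})^{n-x} \simeq \alpha^{x}(1-\alpha)^{n-x}e^{t^{2}/2}, \]

because the first-order terms in the expansions of $x\log p^{*}$ and $(n-x)\log(1-p^{*})$ cancel for $x = \alpha n$, and the second-order terms sum to $t^{2}/2$. It can be checked that the next term in the asymptotics of $\log p^{*}$ (as well as of $\log(1-p^{*})$) decays to 0 after multiplication by $\alpha n$ and $(1-\alpha)n$, respectively.

We have for $t \in \mathbb{R}$:

\[ \phi(t) \simeq e^{-\frac{t^{2}}{2}}\,\frac{(n+1)\,n^{n}}{x^{x}(n-x)^{n-x}}\sqrt{\frac{n}{2\pi x(n-x)}}\sqrt{\frac{2\pi\alpha(1-\alpha)}{n}}\left(\frac{x}{n}\right)^{x}\left(\frac{n-x}{n}\right)^{n-x} \to e^{-\frac{t^{2}}{2}}. \]

This fact establishes the pointwise convergence of the characteristic function to its Gaussian limit and completes the proof of part (a).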

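(b) Write the differential entropy in the form:

\[ h(f_{\alpha}^{(n)}) = -\left(\log\left[(n+1)\binom{n}{x}\right] + (n+1)\binom{n}{x}\,x I_1 + (n+1)\binom{n}{x}(n-x) I_2\right), \tag{2.4} \]

where

\[ I_1 = \int_0^1 p^{x}(1-p)^{n-x}\log p\,dp, \tag{2.5} \]

\[ I_2 = \int_0^1 p^{x}(1-p)^{n-x}\log(1-p)\,dp. \tag{2.6} \]

The integrals $I_1$ and $I_2$ can be computed explicitly by reducing them to the standard integral

\[ \int_0^1 x^{\mu-1}(1-x^{r})^{\nu-1}\log x\,dx = \frac{1}{r^{2}}B\left(\frac{\mu}{r},\nu\right)\left(\psi\left(\frac{\mu}{r}\right) - \psi\left(\frac{\mu}{r}+\nu\right)\right), \tag{2.7} \]

where $\psi(x)$ is the digamma function and $B(x,y)$ is the Beta-function [9, #4.253.1]; in the case considered, $r = 1$, $\mu - 1 = x$, $\nu - 1 = n - x$.

For the integral $I_1$ we get:

\[ U_1 = (n+1)\binom{n}{x}\,x I_1 = -x\left(\psi(n+2) - \psi(x+1)\right). \]

Similarly, for the second integral $I_2$ we obtain:

\[ U_2 = (n+1)\binom{n}{x}(n-x) I_2 = -(n-x)\left(\psi(n+2) - \psi(n-x+1)\right). \]

After summation of these two terms and using the asymptotics of the digamma function [9, #8.362.2], we obtain:

\[ U_1 + U_2 = x\log x - n\log n + (n-x)\log(n-x) - \frac{1}{2} + O\left(\frac{1}{n}\right). \]

Next, we apply the Stirling formula to the first term in (2.4):

\[ U_0 = \log\left[(n+1)\binom{n}{x}\right] = n\log n - x\log x - (n-x)\log(n-x) + \frac{1}{2}\log n - \frac{1}{2}\log\alpha - \frac{1}{2}\log(1-\alpha) - \log\sqrt{2\pi} + O\left(\frac{1}{n}\right). \]

Here, as before, $x = \lfloor \alpha n \rfloor$. So we obtain the following asymptotics of the differential entropy:

\[ \lim_{n\to\infty}\left[h(f_{\alpha}^{(n)}) - \frac{1}{2}\log\frac{2\pi e\,\alpha(1-\alpha)}{n}\right] = 0. \tag{2.8} \]

Due to (1.6), the differential entropy of the RV $\tilde Z_{\alpha}^{(n)}$ has the form:

\[ \lim_{n\to\infty} h(\tilde f_{\alpha}^{(n)}) = \frac{1}{2}\log(2\pi e). \tag{2.9} \]

(c) By the definition of the Kullback-Leibler divergence:

\[ D(\tilde f_{\alpha}^{(n)} \,\|\, \varphi) = -h(\tilde f_{\alpha}^{(n)}) - \int_{\mathbb{R}} \tilde f_{\alpha}^{(n)}(p)\log\varphi(p)\,dp = -\frac{1}{2}\log(2\pi e) + \frac{1}{2}\log(2\pi) + \frac{1}{2}\int_{\mathbb{R}} p^{2}\tilde f_{\alpha}^{(n)}(p)\,dp + O\left(\frac{1}{n}\right) = O\left(\frac{1}{n}\right), \]

where $\int_{\mathbb{R}} p^{2}\tilde f_{\alpha}^{(n)}(p)\,dp = 1 + O\left(\frac{1}{n}\right)$ is the second moment of $\tilde Z_{\alpha}^{(n)}$. This completes the proof. □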
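The asymptotic formula (2.8) can be compared with the exact entropy obtained from (2.4)-(2.7). The sketch below assumes integer $x$, so that the digamma differences reduce to differences of harmonic numbers, $\psi(m+1) - \psi(k+1) = H_m - H_k$; the function names and the test values are illustrative assumptions.

```python
import math


def harmonic(m):
    """Partial sum H_m = 1 + 1/2 + ... + 1/m (H_0 = 0)."""
    return math.fsum(1.0 / k for k in range(1, m + 1))


def posterior_entropy(x, n):
    """Exact differential entropy of Beta(x+1, n-x+1) via (2.4)-(2.7)."""
    # U_0 = log((n+1) * C(n, x))
    u0 = math.lgamma(n + 2) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    h_top = harmonic(n + 1)
    # psi(n+2) - psi(x+1) = H_{n+1} - H_x for integer arguments
    u1 = -x * (h_top - harmonic(x))
    u2 = -(n - x) * (h_top - harmonic(n - x))
    return -(u0 + u1 + u2)


def gaussian_entropy_approx(x, n):
    """Leading term in (2.8): entropy of a Gaussian with variance a(1-a)/n."""
    a = x / n
    return 0.5 * math.log(2 * math.pi * math.e * a * (1 - a) / n)


exact = posterior_entropy(3000, 10000)
approx = gaussian_entropy_approx(3000, 10000)
```

For $\alpha = 0.3$ and $n = 10^{4}$ the two values should agree up to a term of order $1/n$.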

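Theorem 2. Let $\tilde Z_{\beta}^{(n)} = n^{1-\beta/2}(Z_{\beta}^{(n)} - n^{\beta-1})$ be a RV with PDF $\tilde f_{\beta}^{(n)}$ and $Z \sim \mathcal{N}(0,1)$. Then

(a) $\tilde Z_{\beta}^{(n)}$ weakly converges to $Z$:

\[ \tilde Z_{\beta}^{(n)} \Rightarrow Z \quad \text{as } n \to \infty. \]

(b) The differential entropy of $\tilde Z_{\beta}^{(n)}$ converges to the differential entropy of $Z$:

\[ \lim_{n\to\infty} h(\tilde f_{\beta}^{(n)}) = \frac{1}{2}\log(2\pi e). \]

(c) The Kullback-Leibler divergence of $\varphi$ from $\tilde f_{\beta}^{(n)}$ tends to 0 as $n \to \infty$:

\[ \lim_{n\to\infty} D(\tilde f_{\beta}^{(n)} \,\|\, \varphi) = 0. \]

Proof. (a) Let $x = x(n) = n^{\beta}$ where $0 < \beta < 1$ and consider $\tilde Z_{\beta}^{(n)}$ such that

\[ \tilde Z_{\beta}^{(n)} = n^{1-\beta/2}\left(Z_{\beta}^{(n)} - n^{\beta-1}\right). \]

In this case it is more convenient to proceed by the method of moments. We use the following classical result. Let $f_n$ be a sequence of distribution functions with finite moments $\mu_k(n)$, such that $\mu_k(n)$ tends to $\nu_k$ for each $k$ as $n \to \infty$, where $\nu_k$ are the moments of a distribution $f$ which is uniquely defined by its moments. Then $f_n$ weakly converges to $f$ as $n \to \infty$.

Consider the RV $\tilde Z_{\beta}^{(n)} = n^{1-\beta/2}(Z_{\beta}^{(n)} - n^{\beta-1})$, where $Z_{\beta}^{(n)}$ has PDF (1.2) with $x = n^{\beta}$, and compute all moments of $\tilde Z_{\beta}^{(n)}$. First, $E(\tilde Z_{\beta}^{(n)}) \to 0$ as $n \to \infty$ because $E(Z_{\beta}^{(n)}) = n^{\beta-1} + O\left(\frac{1}{n}\right)$.

Next, we check that $E\left[(\tilde Z_{\beta}^{(n)})^{2}\right] = n^{2(1-\beta/2)}E(Z_{\beta}^{(n)} - n^{\beta-1})^{2} \to 1$ as $n \to \infty$. Compute the central moments for any $k \geq 1$:

\[ E\left[(\tilde Z_{\beta}^{(n)})^{k}\right] = n^{k-\frac{k\beta}{2}}(1-n^{1-\beta})^{-k}(1-n^{\beta-1})^{k}\,{}_2F_1[-k, n^{\beta}+1; n+2; n^{1-\beta}], \tag{2.10} \]

where ${}_2F_1[-k, n^{\beta}+1; n+2; n^{1-\beta}]$ is the hypergeometric function, which in this case is the polynomial

\[ {}_2F_1[-k, n^{\beta}+1; n+2; n^{1-\beta}] = \sum_{i=0}^{k}(-1)^{i}\binom{k}{i}\frac{(n^{\beta}+1)_i}{(n+2)_i}\,n^{(1-\beta)i}, \]

where $(q)_n$ is the rising Pochhammer symbol. For $n > 0$

\[ (q)_n = q(q+1)\ldots(q+n-1), \]

and $(q)_0 = 1$.

Consider the asymptotics of the terms separately:

\[ n^{k-\frac{k\beta}{2}}(1-n^{1-\beta})^{-k}(1-n^{\beta-1})^{k} \sim O\left(n^{\frac{k\beta}{2}}\right) \]

and

\[ {}_2F_1[-k, n^{\beta}+1; n+2; n^{1-\beta}] \sim O\left(n^{-\lfloor 0.5+0.5k\rfloor\beta}\right), \tag{2.11} \]

where $\lfloor k \rfloor$ is the integer part of $k$. For $k$ odd:

\[ n^{k(1-\beta/2)}E(Z_{\beta}^{(n)} - n^{\beta-1})^{k} = O\left(n^{\frac{k\beta}{2}}\right)O\left(n^{-\lfloor 0.5+0.5k\rfloor\beta}\right) \sim O\left(n^{-\beta/2}\right) \to 0 \tag{2.12} \]

as $n \to \infty$. For $k$ even:

\[ n^{k(1-\beta/2)}E(Z_{\beta}^{(n)} - n^{\beta-1})^{k} = O\left(n^{\frac{k\beta}{2}}\right)O\left(n^{-\lfloor 0.5+0.5k\rfloor\beta}\right) = O(1). \tag{2.13} \]

We see that every even central moment tends to a constant, which is the coefficient in front of the term $n^{-\lfloor 0.5+0.5k\rfloor\beta}$ in the hypergeometric function. For $k$ even, we have:

\[ n^{k(1-\beta/2)}E(Z_{\beta}^{(n)} - n^{\beta-1})^{k} \to (k-1)!!. \tag{2.14} \]

These imply that the RV $\tilde Z_{\beta}^{(n)}$ weakly converges to the standard Gaussian RV.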
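The moment convergence can be illustrated numerically. The sketch below is not the hypergeometric computation of the proof: it assumes $\beta = 1/2$ and $n = m^{2}$ (so that $x = n^{\beta} = m$ is an integer), expands $(Z - n^{\beta-1})^{k}$ by the binomial theorem and uses exact rational raw moments of the Beta posterior. All names are hypothetical.

```python
import math
from fractions import Fraction


def raw_moment(a, total, i):
    """E[Z^i] for Z ~ Beta(a, total - a): ratio of rising factorials (a)_i / (total)_i."""
    moment = Fraction(1)
    for j in range(i):
        moment *= Fraction(a + j, total + j)
    return moment


def standardized_moment(m, k):
    """E[(n^{1-beta/2} (Z - n^{beta-1}))^k] for beta = 1/2, n = m^2, x = m."""
    n, x = m * m, m
    c = Fraction(x, n)  # centering constant n^{beta-1}
    # Binomial expansion of E[(Z - c)^k], evaluated exactly.
    central = sum(math.comb(k, i) * raw_moment(x + 1, n + 2, i) * (-c) ** (k - i)
                  for i in range(k + 1))
    scale = n / math.sqrt(x)  # n^{1-beta/2}, because x = n^beta
    return float(central) * scale ** k


m2 = standardized_moment(1000, 2)
m3 = standardized_moment(1000, 3)
m4 = standardized_moment(1000, 4)
```

For $n = 10^{6}$ the second and fourth moments should be close to the Gaussian values 1 and 3, while the third moment is small and decays like $n^{-\beta/2}$.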

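(b) Write the differential entropy in the form:

\[ h(f_{\beta}^{(n)}) = -\left(\log\left[(n+1)\binom{n}{x}\right] + (n+1)\binom{n}{x}\,x I_1 + (n+1)\binom{n}{x}(n-x) I_2\right) = -(U_0 + U_1 + U_2), \tag{2.15} \]

where $I_1$ and $I_2$ are defined in (2.5) and (2.6) and can be computed explicitly by (2.7).

As before, we apply the Stirling formula to $U_0$:

\[ U_0 = n\log n - x\log x - (n-x)\log(n-x) + \log n + \frac{1}{2}\left(-\log n^{\beta} - \log(1-n^{\beta-1})\right) - \frac{1}{2}\log(2\pi) + O\left(\frac{1}{n^{\beta}}\right). \]

Since $0 < \beta < 1$, the remainder tends to 0 as $n \to \infty$. Note that the rate of decay depends on the parameter $\beta$, contrary to the remainder in Theorem 1. Now $U_1 + U_2$ can be computed as follows:

\[ U_1 + U_2 = x\log x - n\log n + (n-x)\log(n-x) - \frac{1}{2} + O\left(\frac{1}{n^{\beta}}\right). \]

So, we proved that

\[ \lim_{n\to\infty}\left[h(f_{\beta}^{(n)}) - \frac{1}{2}\log\frac{2\pi e\,(1-n^{\beta-1})}{n^{2-\beta}}\right] = 0. \]

Due to (1.6), the differential entropy of the RV $\tilde Z_{\beta}^{(n)}$ has the form:

\[ \lim_{n\to\infty} h(\tilde f_{\beta}^{(n)}) = \frac{1}{2}\log(2\pi e). \]

(c) Similarly, by the definition of the Kullback-Leibler divergence:

\[ D(\tilde f_{\beta}^{(n)} \,\|\, \varphi) = -h(\tilde f_{\beta}^{(n)}) - \int_{\mathbb{R}} \tilde f_{\beta}^{(n)}(p)\log\varphi(p)\,dp = -\frac{1}{2}\log(2\pi e) + \frac{1}{2}\log(2\pi) + \frac{1}{2}\int_{\mathbb{R}} p^{2}\tilde f_{\beta}^{(n)}(p)\,dp + O\left(\frac{1}{n^{\beta}}\right) = O\left(\frac{1}{n^{\beta}}\right), \]

where $\int_{\mathbb{R}} p^{2}\tilde f_{\beta}^{(n)}(p)\,dp = 1 + O\left(\frac{1}{n^{\beta}}\right)$ is the second moment of $\tilde Z_{\beta}^{(n)}$. □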


Theorem 3. Let $\tilde Z_{c_1}^{(n)} = nZ_{c_1}^{(n)}$ be a RV with PDF $\tilde f_{c_1}^{(n)}$ and $\tilde Z_{n-c_2}^{(n)} = nZ_{n-c_2}^{(n)}$ be a RV with PDF $\tilde f_{n-c_2}^{(n)}$. Denote by $H_k = 1 + \frac{1}{2} + \ldots + \frac{1}{k}$ the partial sum of the harmonic series and by $\gamma$ the Euler-Mascheroni constant. Then

\[ \text{(a)}\quad \lim_{n\to\infty} h(\tilde f_{c_1}^{(n)}) = c_1 + \sum_{i=0}^{c_1-1}\log(c_1-i) - c_1(H_{c_1}-\gamma) + 1, \]

\[ \text{(b)}\quad \lim_{n\to\infty} h(\tilde f_{n-c_2}^{(n)}) = c_2 + \sum_{i=0}^{c_2-1}\log(c_2-i) - c_2(H_{c_2}-\gamma) + 1. \]

Proof. (a) Let $x = c_1$ where $c_1$ is some integer constant. Consider the differential entropy $h(f_{c_1}^{(n)}) = -(U_0 + U_1 + U_2)$, where $U_0$, $U_1$ and $U_2$ are defined in (2.15). Applying the Stirling formula to $U_0$:

\[ U_0 = \log n - \log(x!) + x\log n + O\left(\frac{1}{n}\right). \]

Next, we compute $U_1 + U_2$ via formula (2.7) as before. The only difference is in the asymptotics of the digamma functions [9, #8.365.3, #8.365.4], because $x = c_1$ where $c_1$ is a constant: $\psi(n-x+1) \sim \log n + O\left(\frac{1}{n}\right)$ and $\psi(x+1) = H_x - \gamma$; here $H_x$ is the partial sum of the harmonic series and $\gamma$ stands for the Euler-Mascheroni constant. Using that $x = c_1$:

\[ \lim_{n\to\infty}\left[h(f_{c_1}^{(n)}) + \log n\right] = c_1 + \sum_{i=0}^{c_1-1}\log(c_1-i) - c_1(H_{c_1}-\gamma) + 1. \]

Due to (1.6), it can be written in the following form:

\[ \lim_{n\to\infty} h(\tilde f_{c_1}^{(n)}) = c_1 + \sum_{i=0}^{c_1-1}\log(c_1-i) - c_1(H_{c_1}-\gamma) + 1. \]

(b) Let $n - x(n) = c_2$ where $c_2$ is some integer constant. In a similar way we compute $h(f_{n-c_2}^{(n)})$. The asymptotics of the digamma function is given as follows [9, #8.365.4]:

\[ \psi(n-x+1) = H_{c_2} - \gamma \quad \text{where } x = n - c_2, \]

and the final result for the differential entropy is:

\[ h(f_{n-c_2}^{(n)}) = -\log n + c_2 - c_2(H_{c_2}-\gamma) + \sum_{i=0}^{c_2-1}\log(c_2-i) + 1 + O\left(\frac{1}{n}\right). \]

In terms of the standardized RV $\tilde Z_{n-c_2}^{(n)}$ we obtain, due to (1.6):

\[ \lim_{n\to\infty} h(\tilde f_{n-c_2}^{(n)}) = c_2 + \sum_{i=0}^{c_2-1}\log(c_2-i) - c_2(H_{c_2}-\gamma) + 1. \]

□
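The limit in Theorem 3 can be checked numerically for a fixed number of successes. The sketch below is an illustration under stated assumptions: it uses integer $c$, evaluates the exact Beta entropy through harmonic numbers, uses $\sum_{i=0}^{c-1}\log(c-i) = \log c!$, and hard-codes the Euler-Mascheroni constant as a float; the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (float approximation)


def harmonic(m):
    """Partial sum H_m of the harmonic series (H_0 = 0)."""
    return math.fsum(1.0 / k for k in range(1, m + 1))


def beta_entropy(x, n):
    """Exact differential entropy of the posterior Beta(x+1, n-x+1), integer x."""
    u0 = math.lgamma(n + 2) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    h_top = harmonic(n + 1)
    u1 = -x * (h_top - harmonic(x))
    u2 = -(n - x) * (h_top - harmonic(n - x))
    return -(u0 + u1 + u2)


def limit_entropy(c):
    """Right-hand side of Theorem 3: c + log(c!) - c (H_c - gamma) + 1."""
    return c + math.lgamma(c + 1) - c * (harmonic(c) - EULER_GAMMA) + 1


n, c = 10**5, 3
shifted_a = beta_entropy(c, n) + math.log(n)      # case x = c_1
shifted_b = beta_entropy(n - c, n) + math.log(n)  # case n - x = c_2
target = limit_entropy(c)
```

Adding $\log n$ corresponds to the rescaling by $n$ through (1.6); both shifted values should approach `target` as $n$ grows.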


3 Asymptotic of weighted differential entropies 


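The normalizing constant in the weight function (1.12) is found from the condition (3.33). We obtain that:

\[ \Lambda^{(n)}(\gamma) = \frac{\Gamma(x+1)\Gamma(n-x+1)\Gamma(n+2+\sqrt{n})}{\Gamma(x+\gamma\sqrt{n}+1)\Gamma(n-x+1+\sqrt{n}-\gamma\sqrt{n})\Gamma(n+2)}. \tag{3.1} \]

We denote by $\psi^{(0)}(x) = \psi(x)$ and by $\psi^{(1)}(x)$ the digamma function and its first derivative, respectively:

\[ \psi^{(n)}(x) = \frac{d^{n+1}}{dx^{n+1}}\log(\Gamma(x)). \tag{3.2} \]

In further calculations we will need the asymptotics of these functions:

\[ \psi(x) = \log(x) - \frac{1}{2x} + O\left(\frac{1}{x^{2}}\right) \quad \text{as } x \to \infty, \]

\[ \psi^{(1)}(x) = \frac{1}{x} + \frac{1}{2x^{2}} + O\left(\frac{1}{x^{3}}\right) \quad \text{as } x \to \infty. \]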


Proposition 1. Let $Z^{(n)}$ be a RV with PDF $f^{(n)}$, the conditional PDF after $n$ trials given by (1.2), and let $h^{\phi}(f^{(n)})$ be the weighted Shannon entropy of $Z^{(n)}$ given in (1.15). When $x = \alpha n$ ($0 < \alpha < 1$) and the weight function $\phi^{(n)}(p)$ is given in (1.12),

\[ \lim_{n\to\infty}\left(h^{\phi}(f_{\alpha}^{(n)}) - \frac{1}{2}\log\frac{2\pi e\,\alpha(1-\alpha)}{n}\right) = \frac{(\alpha-\gamma)^{2}}{2\alpha(1-\alpha)}. \tag{3.3} \]

If $\alpha = \gamma$, then the asymptotics of $h^{\phi}(f)$ is exactly the asymptotics of Shannon's differential entropy with $\phi^{(n)}(p) \equiv 1$.

Proof. The Shannon differential entropy of the PDF $f^{(n)}(p) = f(p)$ given in (1.2) with the weight function $\phi^{(n)}(p)$ given in (1.12) takes the form:

\[ h^{\phi}(f) = -\left(\log\left[(n+1)\binom{n}{x}\right] + x\int_0^1 \log(p)\,\phi^{(n)}(p)f(p)\,dp + (n-x)\int_0^1 \log(1-p)\,\phi^{(n)}(p)f(p)\,dp\right). \]

The integrals can be computed explicitly [9] (page 552):

\[ \int_0^1 x^{\mu-1}(1-x^{r})^{\nu-1}\log(x)\,dx = \frac{1}{r^{2}}B\left(\frac{\mu}{r},\nu\right)\left(\psi\left(\frac{\mu}{r}\right) - \psi\left(\frac{\mu}{r}+\nu\right)\right). \]

Applying this formula to the integrals, we get:

\[ \int_0^1 \log(p)\,\phi^{(n)}(p)f(p)\,dp = \psi(x+z+1) - \psi(n+\sqrt{n}+2), \]

where $z = \gamma\sqrt{n}$ and $\psi(x)$ is the digamma function, and

\[ \int_0^1 \log(1-p)\,\phi^{(n)}(p)f(p)\,dp = \psi(n-x+\sqrt{n}-z+1) - \psi(n+\sqrt{n}+2). \]

So we have that

\[ h^{\phi}(f) = -\left(\log\left[(n+1)\binom{n}{x}\right] + x\psi(x+z+1) + (n-x)\psi(n-x+\sqrt{n}-z+1) - n\psi(n+\sqrt{n}+2)\right). \]

By Stirling's formula we have that for $x = \alpha n$:

\[ \log\left[(n+1)\binom{n}{x}\right] = n\log(n) - x\log x - (n-x)\log(n-x) + \frac{1}{2}\log(n) - \frac{1}{2}\log(\alpha) - \frac{1}{2}\log(1-\alpha) - \log\sqrt{2\pi} + O\left(\frac{1}{n}\right). \]

Using the asymptotics of the digamma function

\[ \psi(x+z+1) = \log(x) + \frac{\gamma\sqrt{n}}{x} + \frac{\alpha-\gamma^{2}}{2\alpha x} + O\left(\frac{1}{n^{3/2}}\right), \]

\[ \psi(n-x+\sqrt{n}-z+1) = \log(n-x) + \frac{(1-\gamma)\sqrt{n}}{n-x} + \frac{2\gamma-\gamma^{2}-\alpha}{2(1-\alpha)(n-x)} + O\left(\frac{1}{n^{3/2}}\right), \]

\[ \psi(n+\sqrt{n}+2) = \log(n) + \frac{1}{\sqrt{n}} + O\left(\frac{1}{n}\right), \]

we get

\[ h^{\phi}(f) = \frac{1}{2}\log\frac{2\pi e\,\alpha(1-\alpha)}{n} + \frac{(\alpha-\gamma)^{2}}{2\alpha(1-\alpha)} + O\left(\frac{1}{\sqrt{n}}\right). \tag{3.4} \]

The first term in (3.4) is the differential entropy, with weight $\phi \equiv 1$, of a Gaussian RV. Moreover, note that the asymptotics of the weighted entropy exceeds that of the classical entropy studied above. The only difference is a constant, which tends to zero as $\gamma \to \alpha$. □
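The constant shift in (3.3) can be observed numerically. The sketch below is a hedged illustration: it assumes that $n$ is a perfect square and that $z = \gamma\sqrt{n}$ is an integer, so every digamma argument is an integer and the differences $\psi(a+1) - \psi(b+1) = H_a - H_b$ can be evaluated with harmonic numbers. Function names and test values are hypothetical.

```python
import math


def harmonic(m):
    """Partial sum H_m of the harmonic series (H_0 = 0)."""
    return math.fsum(1.0 / k for k in range(1, m + 1))


def weighted_shannon_entropy(n, x, z):
    """Exact weighted entropy for the weight (1.12); z = gamma * sqrt(n) must be an integer."""
    s = math.isqrt(n)  # assumes n is a perfect square
    log_c = math.lgamma(n + 2) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    h_top = harmonic(n + s + 1)                     # psi(n + sqrt(n) + 2) up to a constant
    t1 = x * (harmonic(x + z) - h_top)              # x * E_w[log p]
    t2 = (n - x) * (harmonic(n - x + s - z) - h_top)  # (n-x) * E_w[log(1-p)]
    return -(log_c + t1 + t2)


n, alpha, gamma = 10000, 0.3, 0.5
x, z = 3000, 50
exact = weighted_shannon_entropy(n, x, z)
gaussian = 0.5 * math.log(2 * math.pi * math.e * alpha * (1 - alpha) / n)
shift = (alpha - gamma) ** 2 / (2 * alpha * (1 - alpha))
```

The difference `exact - gaussian` should be close to `shift`, with a discrepancy that shrinks as $n$ grows.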

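Theorem 4. Let $Z^{(n)}$ be a RV with PDF $f^{(n)}$, the conditional PDF after $n$ trials given by (1.2), and with the weighted Renyi differential entropy $H_{\nu}^{\phi}(f^{(n)})$ given in (1.8).

(a) When both $x$ and $n - x$ tend to infinity as $n \to \infty$, in the case $\phi^{(n)}(p) \equiv 1$,

\[ \lim_{n\to\infty}\left(H_{\nu}(f^{(n)}) - \frac{1}{2}\log\frac{2\pi x(n-x)}{n^{3}} + \frac{\log(\nu)}{2(1-\nu)}\right) = 0. \tag{3.5} \]

For any fixed $n$, as $\nu \to 1$, Renyi's differential entropy of $Z^{(n)}$ tends to Shannon's differential entropy of $Z^{(n)}$.

(b) When $x = \alpha n$ ($0 < \alpha < 1$) and the weight function is given in (1.12),

\[ \lim_{n\to\infty}\left(H_{\nu}^{\phi}(f^{(n)}) - \frac{1}{2}\log\frac{2\pi\alpha(1-\alpha)}{n}\right) = -\frac{\log(\nu)}{2(1-\nu)} + \frac{(\alpha-\gamma)^{2}}{2\alpha(1-\alpha)\nu}. \tag{3.6} \]

For any fixed $n$, the Renyi weighted differential entropy tends to Shannon's weighted differential entropy of the RV with PDF given in (1.2) as $\nu \to 1$.

Proof. (a) In this case $\phi^{(n)}(p) \equiv 1$, so the Renyi entropy has the form:

\[ (1-\nu)H_{\nu}(f) = \log\int_0^1 (f(p))^{\nu}\,dp = \nu\log\left[(n+1)\binom{n}{x}\right] + \log\int_0^1 p^{\nu x}(1-p)^{\nu(n-x)}\,dp = U_0 + U_1. \]

By the Stirling formula:

\[ U_0 = \nu\log\left[(n+1)\binom{n}{x}\right] = \nu n\log(n) - \nu x\log(x) - \nu(n-x)\log(n-x) + \nu\log(n) + \frac{\nu}{2}\log(n) - \frac{\nu}{2}\log(x) - \frac{\nu}{2}\log(n-x) - \frac{\nu}{2}\log(2\pi) + O\left(\frac{1}{n}\right). \]

Consider the integral:

\[ \int_0^1 p^{\nu x}(1-p)^{\nu(n-x)}\,dp = B(\nu x+1, \nu(n-x)+1) = \frac{\Gamma(\nu x+1)\Gamma(\nu(n-x)+1)}{\Gamma(\nu n+2)}. \]

So, by the Stirling formula again:

\[ U_1 = \log\frac{\Gamma(\nu x+1)\Gamma(\nu(n-x)+1)}{\Gamma(\nu n+2)} \]
\[ = \nu x\log(\nu) + \nu x\log(x) - \nu x + \frac{1}{2}\log(\nu) + \frac{1}{2}\log(x) + \frac{1}{2}\log(2\pi) \]
\[ + \nu(n-x)\log(\nu) + \nu(n-x)\log(n-x) - \nu(n-x) + \frac{1}{2}\log(\nu) + \frac{1}{2}\log(n-x) + \frac{1}{2}\log(2\pi) \]
\[ - \left(\nu n\log(\nu) + \nu n\log(n) - \nu n + \frac{1}{2}\log(\nu) + \frac{1}{2}\log(n) + \frac{1}{2}\log(2\pi)\right) - \log(\nu) - \log(n) + O\left(\frac{1}{n}\right). \]

We obtain that

\[ U_0 + U_1 = \frac{1-\nu}{2}\log(x) + \frac{1-\nu}{2}\log(n-x) + \frac{1-\nu}{2}\log(2\pi) - \frac{1}{2}\log(\nu) + \nu\log(n) - \log(n) - \frac{1-\nu}{2}\log(n) + O\left(\frac{1}{n}\right) \]
\[ = (1-\nu)\frac{1}{2}\left(-\log(n) + \log(x) + \log(n-x) + \log(2\pi) - 2\log(n)\right) - \frac{1}{2}\log(\nu) + O\left(\frac{1}{n}\right) \]
\[ = \frac{1-\nu}{2}\log\frac{2\pi x(n-x)}{n^{3}} - \frac{1}{2}\log(\nu) + O\left(\frac{1}{n}\right). \]

So we have that:

\[ H_{\nu}(f) = \frac{1}{2}\log\frac{2\pi x(n-x)}{n^{3}} - \frac{\log(\nu)}{2(1-\nu)} + O\left(\frac{1}{n}\right); \tag{3.7} \]

note that it tends to the Renyi differential entropy of a Gaussian RV as $n \to \infty$. Taking the limit as $\nu \to 1$ and applying L'Hopital's rule we get that:

\[ H_{\nu=1}(f) = \lim_{\nu\to 1}H_{\nu}(f) = \frac{1}{2}\log\frac{2\pi e\,x(n-x)}{n^{3}} + O\left(\frac{1}{n}\right). \]

For example, when $x = \alpha n$, $0 < \alpha < 1$, the Renyi entropy is

\[ H_{\nu=1}(f) = \frac{1}{2}\log\frac{2\pi e\,[\alpha(1-\alpha)]}{n} + O\left(\frac{1}{n}\right), \tag{3.8} \]

where the first term is Shannon's entropy of a Gaussian RV with the corresponding variance. Similarly, when $x = n^{\beta}$, $0 < \beta < 1$, the Renyi entropy is

\[ H_{\nu=1}(f) = \frac{1}{2}\log\frac{2\pi e\,(1-n^{\beta-1})}{n^{2-\beta}} + O\left(\frac{1}{n^{\beta}}\right), \]

where the first term is Shannon's differential entropy of a Gaussian RV with variance $\sigma^{2} = \frac{1-n^{\beta-1}}{n^{2-\beta}}$.

(b) In this case, when $\phi^{(n)}(p)$ is given in (1.12) and $x = \alpha n$, the weighted Renyi entropy has the form:

\[ H_{\nu}^{\phi}(f) = \frac{1}{1-\nu}\log\int_0^1 \phi^{(n)}(p)\,(f(p))^{\nu}\,dp, \qquad \int_0^1 \phi^{(n)}(p)\,(f(p))^{\nu}\,dp = U_1U_2U_3, \]

where, with $z = \gamma\sqrt{n}$,

\[ U_1 = \frac{\Gamma(\nu x+\gamma\sqrt{n}+1)\Gamma(\nu(n-x)+(1-\gamma)\sqrt{n}+1)}{\Gamma(\nu n+\sqrt{n}+2)}, \quad U_2 = \left(\frac{\Gamma(n+2)}{\Gamma(x+1)\Gamma(n-x+1)}\right)^{\nu-1}, \quad U_3 = \frac{\Gamma(n+\sqrt{n}+2)}{\Gamma(x+z+1)\Gamma(n-x+\sqrt{n}-z+1)}. \]

By the Stirling formula:

\[ \log(U_1) = \nu x\log(x) + z\log(x) + \frac{1}{2}\log(x) + \nu(n-x)\log(n-x) + (\sqrt{n}-z)\log(n-x) + \frac{1}{2}\log(2\pi) - \frac{1}{2}\log(\nu) + \frac{1}{2}\log(n-x) - \nu n\log(n) - \sqrt{n}\log(n) - \frac{1}{2}\log(n) - \log(n) + \frac{(\alpha-\gamma)^{2}}{2\alpha(1-\alpha)\nu} + O\left(\frac{1}{\sqrt{n}}\right), \]

\[ \log(U_2) = \nu n\log(n) - \nu x\log(x) - \nu(n-x)\log(n-x) + \nu\log(n) + \frac{\nu}{2}\left(\log(n) - \log(x) - \log(n-x) - \log(2\pi)\right) - n\log(n) + x\log(x) + (n-x)\log(n-x) - \log(n) - \frac{1}{2}\left(\log(n) - \log(x) - \log(n-x) - \log(2\pi)\right) + O\left(\frac{1}{n}\right), \]

\[ \log(U_3) = \log(n) + n\log(n) + \sqrt{n}\log(n) - x\log(x) - z\log(x) - (n-x)\log(n-x) - (\sqrt{n}-z)\log(n-x) + \frac{1}{2}\left(\log(n) - \log(x) - \log(2\pi) - \log(n-x)\right) - \frac{(\alpha-\gamma)^{2}}{2\alpha(1-\alpha)} + O\left(\frac{1}{\sqrt{n}}\right). \]

Taking all parts together, we obtain that

\[ H_{\nu}^{\phi}(f) = \frac{1}{2}\log\frac{2\pi\alpha(1-\alpha)}{n} - \frac{\log(\nu)}{2(1-\nu)} + \frac{(\alpha-\gamma)^{2}}{2\alpha(1-\alpha)(1-\nu)}\left(\frac{1}{\nu}-1\right) + O\left(\frac{1}{\sqrt{n}}\right). \tag{3.9} \]

Taking the limit as $\nu \to 1$ and applying L'Hopital's rule we get that:

\[ H_{\nu=1}^{\phi}(f) = \lim_{\nu\to 1}H_{\nu}^{\phi}(f) = \frac{1}{2}\log\frac{2\pi e\,\alpha(1-\alpha)}{n} + \frac{(\alpha-\gamma)^{2}}{2\alpha(1-\alpha)} + O\left(\frac{1}{\sqrt{n}}\right). \tag{3.10} \]

So the weighted Renyi entropy tends to Shannon's weighted entropy as $\nu \to 1$. □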
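Both parts of Theorem 4 can be compared with exact values, because the integral of $\phi^{(n)} f^{\nu}$ is a ratio of Gamma functions. The sketch below is an illustration only; the function names, the convention `gamma=None` for the trivial weight and the test values are assumptions.

```python
import math


def log_beta(a, b):
    """log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def renyi_exact(n, x, nu, gamma=None):
    """Exact (weighted) Renyi entropy of the posterior; gamma=None means phi = 1."""
    log_c = -log_beta(x + 1, n - x + 1)  # log((n+1) * C(n, x))
    if gamma is None:
        log_int = nu * log_c + log_beta(nu * x + 1, nu * (n - x) + 1)
    else:
        s = math.sqrt(n)
        z = gamma * s
        # log of the weight constant (3.1)
        log_lam = -log_c - log_beta(x + z + 1, n - x + s - z + 1)
        log_int = (log_lam + nu * log_c
                   + log_beta(nu * x + z + 1, nu * (n - x) + s - z + 1))
    return log_int / (1 - nu)


def renyi_asymptotic(n, x, nu, gamma=None):
    """Leading terms of (3.7), or of (3.9) when a weight is used."""
    a = x / n
    value = 0.5 * math.log(2 * math.pi * a * (1 - a) / n) - math.log(nu) / (2 * (1 - nu))
    if gamma is not None:
        value += (a - gamma) ** 2 / (2 * a * (1 - a) * nu)
    return value


n, x, nu = 10**6, 300000, 2.0
gap_plain = renyi_exact(n, x, nu) - renyi_asymptotic(n, x, nu)
gap_weighted = renyi_exact(n, x, nu, 0.5) - renyi_asymptotic(n, x, nu, 0.5)
```

Both gaps should be small for large $n$; the weighted one is expected to decay more slowly because the weight shifts the posterior by a term of order $n^{-1/2}$.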


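Proposition 2. For any continuous random variable $X$ with PDF $f(x)$ and for any non-negative weight function $\phi(x)$ which satisfies condition (3.33) and is such that

\[ \int_{\mathbb{R}} \phi(x)(f(x))^{\nu}\,|\log(f(x))|\,dx < \infty, \]

the weighted Renyi differential entropy $H_{\nu}^{\phi}(f)$ is a non-increasing function of $\nu$ and

\[ \frac{\partial}{\partial\nu}H_{\nu}^{\phi}(f) = -\frac{1}{(1-\nu)^{2}}\int_{\mathbb{R}} z(x)\log\frac{z(x)}{\phi(x)f(x)}\,dx, \tag{3.11} \]

where

\[ z(x) = \frac{\phi(x)(f(x))^{\nu}}{\int_{\mathbb{R}}\phi(x)(f(x))^{\nu}\,dx}. \]

Similarly, the Tsallis weighted entropy $S_{q}^{\phi}(f)$ given in (1.9) is a non-increasing function of $q$.

Proof. We need to show that $\frac{\partial}{\partial\nu}H_{\nu}^{\phi}(f) \leq 0$. We have

\[ \frac{\partial}{\partial\nu}H_{\nu}^{\phi}(f) = \frac{1}{(1-\nu)^{2}}\log\int_{\mathbb{R}}\phi(x)(f(x))^{\nu}\,dx + \frac{1}{1-\nu}\,\frac{\int_{\mathbb{R}}\phi(x)(f(x))^{\nu}\log(f(x))\,dx}{\int_{\mathbb{R}}\phi(x)(f(x))^{\nu}\,dx}. \tag{3.12} \]

Denote

\[ z(x) = \frac{\phi(x)(f(x))^{\nu}}{\int_{\mathbb{R}}\phi(x)(f(x))^{\nu}\,dx}. \tag{3.13} \]

Note that $z(x) \geq 0$ for any $x$ and $\int_{\mathbb{R}} z(x)\,dx = 1$.

Let $Q_1 = \int_{\mathbb{R}}\phi(x)(f(x))^{\nu}\,dx$ and $Q_2 = \log\int_{\mathbb{R}}\phi(x)(f(x))^{\nu}\,dx$. Using the substitution (3.13),

\[ Q_2 = \log(\phi(x)) + \nu\log(f(x)) - \log(z(x)). \tag{3.14} \]

We have that

\[ I_1 = \frac{Q_2}{(1-\nu)^{2}}, \qquad I_2 = \frac{1}{1-\nu}\int_{\mathbb{R}} z(x)\log(f(x))\,dx, \]

\[ I_1 + I_2 = \frac{1}{(1-\nu)^{2}}\left(\log\int_{\mathbb{R}}\phi(x)(f(x))^{\nu}\,dx + (1-\nu)\int_{\mathbb{R}} z(x)\log(f(x))\,dx\right) = \frac{I_3}{(1-\nu)^{2}}. \]

Substituting $\log(f(x))$ using (3.14), we get:

\[ I_3 = Q_2 + (1-\nu)\left(\frac{Q_2}{\nu} + \frac{1}{\nu}\int_{\mathbb{R}} z(x)\log(z(x))\,dx - \frac{1}{\nu}\int_{\mathbb{R}} z(x)\log(\phi(x))\,dx\right) \]
\[ = \frac{Q_2}{\nu} + \frac{1}{\nu}\int_{\mathbb{R}} z(x)\log(z(x))\,dx - \int_{\mathbb{R}} z(x)\log(z(x))\,dx + \int_{\mathbb{R}} z(x)\log(\phi(x))\,dx - \frac{1}{\nu}\int_{\mathbb{R}} z(x)\log(\phi(x))\,dx. \]

Applying (3.14) again, we get that

\[ I_3 = \int_{\mathbb{R}} z(x)\log(f(x))\,dx - \int_{\mathbb{R}} z(x)\log(z(x))\,dx + \int_{\mathbb{R}} z(x)\log(\phi(x))\,dx = -\int_{\mathbb{R}} z(x)\log\frac{z(x)}{\phi(x)f(x)}\,dx. \]

We obtain that

\[ \frac{\partial}{\partial\nu}H_{\nu}^{\phi}(f) = -\frac{1}{(1-\nu)^{2}}\int_{\mathbb{R}} z(x)\log\frac{z(x)}{\phi(x)f(x)}\,dx = -\frac{1}{(1-\nu)^{2}}D_{KL}(z \,\|\, \phi f). \tag{3.15} \]

Here $D_{KL}(z \,\|\, \phi f)$ is the Kullback-Leibler divergence between $z$ and $\phi f$, which is always non-negative. Due to the conditions $\phi(x)f(x) \geq 0$ and (3.33), $\phi(x)f(x)$ is itself a PDF:

\[ \int_{\mathbb{R}}\phi(x)f(x)\,dx = 1. \]

Similarly, one can show that the Tsallis weighted differential entropy given in (1.9) is a non-increasing function of $q$. So, the result follows. □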
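Theorem 5. Let $Z^{(n)}$ be a RV with PDF $f^{(n)}$, the conditional PDF after $n$ trials given by (1.2), with the weighted Tsallis differential entropy $S_{q}^{\phi}(f^{(n)})$ given in (1.9).

(a) When both $x$ and $n - x$ tend to infinity as $n \to \infty$ and $\phi^{(n)}(p) \equiv 1$,

\[ \lim_{n\to\infty}\left(S_{q}(f^{(n)}) - \frac{1}{q-1}\left(1 - \frac{1}{\sqrt{q}}\left(\frac{2\pi x(n-x)}{n^{3}}\right)^{\frac{1-q}{2}}\right)\right) = 0. \tag{3.16} \]

For any fixed $n$ the Tsallis differential entropy tends to Shannon's differential entropy as $q \to 1$.

(b) When $x = \alpha n$ and the weight function $\phi^{(n)}(p)$ is given in (1.12),

\[ \lim_{n\to\infty}\left(S_{q}^{\phi}(f^{(n)}) - \frac{1}{q-1}\left(1 - \frac{1}{\sqrt{q}}\left(\frac{2\pi\alpha(1-\alpha)}{n}\right)^{\frac{1-q}{2}}\exp\left(\frac{(\alpha-\gamma)^{2}(1-q)}{2\alpha(1-\alpha)q}\right)\right)\right) = 0. \tag{3.17} \]

The weighted Tsallis differential entropy tends to Shannon's weighted differential entropy of the RV with PDF given in (1.2) as $q \to 1$.

Remark 1. It can be seen from Theorem 4(a) and Theorem 5(a) that for large $n$ Renyi's entropy and Tsallis's entropy (for $\phi \equiv 1$) "behave" like the respective entropies of a Gaussian RV with variance $\sigma^{2} = \frac{x(n-x)}{n^{3}}$.

Proof. (a) In this case $\phi^{(n)}(p) \equiv 1$, and the Tsallis entropy has the form:

\[ S_{q}(f) = \frac{1}{q-1}\left(1 - \int_0^1 (f(p))^{q}\,dp\right) = \frac{1}{q-1}\left(1 - \int_0^1\left((n+1)\binom{n}{x}p^{x}(1-p)^{n-x}\right)^{q}dp\right). \]

It was shown above that

\[ \log\int_0^1 (f(p))^{q}\,dp \simeq \frac{1-q}{2}\log\frac{2\pi x(n-x)}{n^{3}} - \frac{1}{2}\log(q). \]

So we have that

\[ V_0 = \int_0^1 (f(p))^{q}\,dp \simeq \frac{1}{\sqrt{q}}\left(\frac{2\pi x(n-x)}{n^{3}}\right)^{\frac{1-q}{2}}. \]

We straightforwardly obtain that

\[ S_{q}(f) \simeq \frac{1}{q-1}\left(1 - \frac{1}{\sqrt{q}}\left(\frac{2\pi x(n-x)}{n^{3}}\right)^{\frac{1-q}{2}}\right). \tag{3.18} \]

Note that $V_0 \to 1$ as $q \to 1$; applying L'Hospital's rule we get that:

\[ \lim_{q\to 1}S_{q}(f) = S_{q=1}(f) \simeq \frac{1}{2}\log\frac{2\pi e\,x(n-x)}{n^{3}}. \tag{3.19} \]

The first term in the expression above is nothing else but Shannon's differential entropy of a Gaussian RV.

(b) In this case, when $\phi^{(n)}(p)$ is given in (1.12), the Tsallis entropy has the form:

\[ S_{q}^{\phi}(f) = \frac{1}{q-1}\left(1 - \int_0^1\phi^{(n)}(p)\,(f(p))^{q}\,dp\right). \]

Using that $x = \alpha n$ and Stirling's formula, it was shown above that

\[ \log\int_0^1\phi^{(n)}(p)\,(f(p))^{q}\,dp \simeq \frac{1-q}{2}\log\frac{2\pi\alpha(1-\alpha)}{n} - \frac{\log(q)}{2} + \frac{(\alpha-\gamma)^{2}}{2\alpha(1-\alpha)}\left(\frac{1}{q}-1\right). \]

So we have:

\[ V_1 = \int_0^1\phi^{(n)}(p)\,(f(p))^{q}\,dp \simeq \frac{1}{\sqrt{q}}\left(\frac{2\pi\alpha(1-\alpha)}{n}\right)^{\frac{1-q}{2}}\exp\left[\frac{(\alpha-\gamma)^{2}}{2\alpha(1-\alpha)}\left(\frac{1}{q}-1\right)\right]. \]

The weighted Tsallis entropy is

\[ S_{q}^{\phi}(f) \simeq \frac{1}{q-1}\left(1 - \frac{1}{\sqrt{q}}\left(\frac{2\pi\alpha(1-\alpha)}{n}\right)^{\frac{1-q}{2}}\exp\left[\frac{(\alpha-\gamma)^{2}}{2\alpha(1-\alpha)}\left(\frac{1}{q}-1\right)\right]\right). \tag{3.20} \]

Note that $V_1 \to 1$ as $q \to 1$; applying L'Hospital's rule we get that:

\[ S_{q=1}^{\phi}(f) = \lim_{q\to 1}S_{q}^{\phi}(f) \simeq \frac{1}{2}\log\frac{2\pi e\,\alpha(1-\alpha)}{n} + \frac{(\alpha-\gamma)^{2}}{2\alpha(1-\alpha)}. \tag{3.21} \]

Then the weighted Tsallis entropy tends to the weighted Shannon differential entropy as $q \to 1$. □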


Theorem 6. Let &e a RV with - conditional PDF after n trials given by A1.2 1) . when 
x = an (0 < a < 1) and I(f^) is the weighted Fisher information of given in (1.5): 

(a) When (ff Jl ' > (p) = 1, 


lim 

71—> OO 



a(l — a) 


n 


2a 2 - 2a + 1 
2a 2 (l — a) 2 ‘ 


(3.22) 


(b) When cf)^ n \p) is given in hi. 12) : 


lim 

71—>• CO 


i*uL 


7l)> 


+ 7 T~—) W - B(ot, 7 )y/n 


a(l — a) (1 — a) 2 a 2 


= <?(<*, 7), 


(3.23) 


where L>(a, 7) and C(a, 7) are constants which depend only on a and 7 and are given in h3.29) 
and A3.30) respectively . 

Proof, (a) The Fisher information in the case <f>^ n \p) = 1 and x = an takes the form: 


/(a) = E 


d_ 

da 


i°g/( l; a 


a = 


_9_ 

9a 


log/(p;a) f(p,a)dp, 


where / = fa ' 1 . Next, 

log(/(p, a)) = anlog(p) + (1 — a)nlog(l — p) + log(n + 1)! — log(x!) — log((n — x)!) 


and 

d 

—log/(p; a) = nlog(p) — nlog(l — p) + mp(n — x + 1) — n-0(x + 1), (3.24) 

( —log/(p; a) j = n 2 log 2 (p) +n 2 log 2 (l — p) + n 2 'if 2 {n — x + 1) +n 2, 0 2 (x + 1) — 2n 2 log(p)log(l — 

p) + 2n 2 log(p)-0(n — x + 1) — 2n 2 log(p)-0(x + 1) — 2n 2 log(l — p)if{n — x +1) + 2n 2 log(l —p)-0(x + 
1 ) — 2 n 2 f>(x + 1 )-0(n — x + 1). 

To compute the expectation we will need the following integrals:

\[ \int_0^1 (\log p)^2 p^x(1-p)^{n-x}\,dp = \frac{\Gamma(n-x+1)\Gamma(x+1)}{\Gamma(n+2)}\left[(\psi(n+2)-\psi(x+1))^2 - \psi^{(1)}(n+2) + \psi^{(1)}(x+1)\right], \]

where $\Gamma(x)$ is the Gamma function and $\psi^{(1)}(x)$ is the first derivative of the digamma function;

\[ \int_0^1 (\log(1-p))^2 p^x(1-p)^{n-x}\,dp = \frac{\Gamma(n-x+1)\Gamma(x+1)}{\Gamma(n+2)}\left[(\psi(n+2)-\psi(n-x+1))^2 - \psi^{(1)}(n+2) + \psi^{(1)}(n-x+1)\right], \]

\[ \int_0^1 \log p\,\log(1-p)\,p^x(1-p)^{n-x}\,dp = \frac{\Gamma(n-x+1)\Gamma(x+1)}{\Gamma(n+2)}\left[(\psi(n+2)-\psi(n-x+1))(\psi(n+2)-\psi(x+1)) - \psi^{(1)}(n+2)\right], \]

\[ \int_0^1 \log p\;p^x(1-p)^{n-x}\,dp = \frac{\Gamma(n-x+1)\Gamma(x+1)}{\Gamma(n+2)}\left(-\psi(n+2)+\psi(x+1)\right), \]

\[ \int_0^1 \log(1-p)\,p^x(1-p)^{n-x}\,dp = \frac{\Gamma(n-x+1)\Gamma(x+1)}{\Gamma(n+2)}\left(-\psi(n+2)+\psi(n-x+1)\right). \]


So, we have that

\[ \begin{aligned} \int_0^1 \left(\frac{\partial}{\partial\alpha}\log f(p;\alpha)\right)^2 f(p,\alpha)\,dp = {} & n^2(n+1)\binom{n}{x}\frac{\Gamma(n-x+1)\Gamma(x+1)}{\Gamma(n+2)}\Big( (\psi(n+2))^2 + (\psi(x+1))^2 - 2\psi(n+2)\psi(x+1) - \psi^{(1)}(n+2) + \psi^{(1)}(x+1) \\ & + (\psi(n+2))^2 + (\psi(n-x+1))^2 - 2\psi(n+2)\psi(n-x+1) - \psi^{(1)}(n+2) + \psi^{(1)}(n-x+1) \\ & + (\psi(n-x+1))^2 + (\psi(x+1))^2 - 2(\psi(n+2))^2 + 2\psi(n+2)\psi(x+1) + 2\psi(n-x+1)\psi(n+2) \\ & - 2\psi(n-x+1)\psi(x+1) + 2\psi^{(1)}(n+2) - 2\psi(n-x+1)\psi(n+2) + 2\psi(n-x+1)\psi(x+1) \\ & + 2\psi(x+1)\psi(n+2) - 2(\psi(x+1))^2 - 2(\psi(n-x+1))^2 + 2\psi(n-x+1)\psi(n+2) \\ & + 2\psi(x+1)\psi(n-x+1) - 2\psi(x+1)\psi(n+2) - 2\psi(x+1)\psi(n-x+1)\Big) \\ = {} & n^2\left(\psi^{(1)}(x+1) + \psi^{(1)}(n-x+1)\right), \end{aligned} \]

since the prefactor $(n+1)\binom{n}{x}\Gamma(n-x+1)\Gamma(x+1)/\Gamma(n+2)$ equals 1 and all terms in the bracket cancel except $\psi^{(1)}(x+1)$ and $\psi^{(1)}(n-x+1)$.


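\[ I(\alpha) = n^2\left(\psi^{(1)}(x+1) + \psi^{(1)}(n-x+1)\right). \]

Using the asymptotic (3.40) for the first derivative of the digamma function we can rewrite:

\[ I(\alpha) = \frac{n}{\alpha(1-\alpha)} - \frac{2\alpha^2-2\alpha+1}{2\alpha^2(1-\alpha)^2} + O\left(\frac{1}{n}\right). \tag{3.25} \]

Remark 2. When $x = \alpha n$,

\[ \int_0^1 p\,f_\alpha^{(n)}(p)\,dp = \alpha + b_n(\alpha), \qquad b_n(\alpha) \simeq \frac{1-2\alpha}{n}, \tag{3.26} \]

where $b_n(\alpha)$ is a bias. Note that

\[ \frac{\partial}{\partial\alpha} b_n(\alpha) \simeq -\frac{2}{n} \to 0 \]

as $n \to \infty$. So, our estimate is asymptotically unbiased. Also note that the first term in Theorem 6 has the same form as in the classical problem of estimating $p$ in a series of $n$ binary trials,

\[ \frac{n}{p(1-p)}. \]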
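The two-term expansion of the unweighted Fisher information can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the closed form $I(\alpha) = n^2(\psi^{(1)}(x+1) + \psi^{(1)}(n-x+1))$ with $x = \alpha n$, and uses a hand-written `trigamma` helper (a hypothetical name; recurrence plus the standard asymptotic series) so that only the Python standard library is needed.

```python
import math


def trigamma(x: float) -> float:
    """psi^(1)(x) for x > 0: upward recurrence, then the asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc += 1.0 / (x * x)  # psi1(x) = psi1(x + 1) + 1/x^2
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # 1/(6x^3) - 1/(30x^5) + 1/(42x^7) - 1/(30x^9) + 5/(66x^11)
    tail = inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 * (1 / 42 - inv2 * (1 / 30 - inv2 * 5 / 66))))
    return acc + inv + 0.5 * inv2 + tail


def fisher_exact(n: float, alpha: float) -> float:
    """Closed form n^2 (psi1(x + 1) + psi1(n - x + 1)) with x = alpha * n."""
    x = alpha * n
    return n * n * (trigamma(x + 1.0) + trigamma(n - x + 1.0))


def fisher_two_terms(n: float, alpha: float) -> float:
    """Leading terms n/(a(1-a)) - (2a^2 - 2a + 1)/(2 a^2 (1-a)^2)."""
    a = alpha
    return n / (a * (1 - a)) - (2 * a * a - 2 * a + 1) / (2 * a * a * (1 - a) ** 2)


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        gap = fisher_exact(n, 0.3) - fisher_two_terms(n, 0.3)
        print(n, gap)  # the gap should shrink roughly like 1/n
```

The printed gap is the $O(1/n)$ remainder; it should decrease by about a factor of ten each time $n$ grows tenfold.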
(b) The weighted Fisher information in the case $x = \alpha n$ $(0 < \alpha < 1)$ takes the following form:

\[ I^{\phi}(f) = E\left[\phi^{(n)}(Z)\left(\frac{\partial}{\partial\alpha}\log f(Z;\alpha)\right)^2 \,\Big|\, \alpha\right] = \int_0^1 \phi^{(n)}(p)\left(\frac{\partial}{\partial\alpha}\log f(p;\alpha)\right)^2 f(p,\alpha)\,dp, \]

where $\phi^{(n)}(p)$ is given in (1.12). The second factor under the integral, $\frac{\partial}{\partial\alpha}\log f(p;\alpha)$, is found exactly as before.

Let $z = \gamma\sqrt{n}$ and

\[ W = \frac{\Gamma(n-x+1+\sqrt{n}-z)\,\Gamma(x+1+z)}{\Gamma(n+2+\sqrt{n})}. \]

In order to compute the weighted Fisher information we will need the following integrals:

\[ \int_0^1 (\log p)^2 p^{z+x}(1-p)^{n-x+\sqrt{n}-z}\,dp = W\left[(\psi(n+2+\sqrt{n}) - \psi(x+z+1))^2 - \psi^{(1)}(n+2+\sqrt{n}) + \psi^{(1)}(x+z+1)\right], \]

\[ \int_0^1 (\log(1-p))^2 p^{z+x}(1-p)^{n-x+\sqrt{n}-z}\,dp = W\left[(\psi(n+2+\sqrt{n}) - \psi(n-x+1+\sqrt{n}-z))^2 - \psi^{(1)}(n+2+\sqrt{n}) + \psi^{(1)}(n-x+1+\sqrt{n}-z)\right], \]

\[ \int_0^1 \log p\,\log(1-p)\,p^{z+x}(1-p)^{n-x+\sqrt{n}-z}\,dp = W\left[(\psi(n+2+\sqrt{n}) - \psi(n-x+1+\sqrt{n}-z))(\psi(n+2+\sqrt{n}) - \psi(x+1+z)) - \psi^{(1)}(n+2+\sqrt{n})\right], \]

\[ \int_0^1 \log p\;p^{z+x}(1-p)^{n-x+\sqrt{n}-z}\,dp = W\left(-\psi(n+2+\sqrt{n}) + \psi(x+1+z)\right), \]

\[ \int_0^1 \log(1-p)\,p^{z+x}(1-p)^{n-x+\sqrt{n}-z}\,dp = W\left(-\psi(n+2+\sqrt{n}) + \psi(n-x+1+\sqrt{n}-z)\right). \]
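Taking all parts together:

\[ \begin{aligned} I^{\phi}(f_\alpha^{(n)}) = {} & n^2\left(\psi^{(1)}(x+z+1) + \psi^{(1)}(n-x+1+\sqrt{n}-z)\right) \\ & + n^2\left[(\psi(x+z+1) - \psi(x+1))^2 + (\psi(n-x+1+\sqrt{n}-z) - \psi(n-x+1))^2\right] \\ & + 2n^2\left[(\psi(n-x+1) - \psi(n-x+\sqrt{n}-z+1))(\psi(x+z+1) - \psi(x+1))\right]. \end{aligned} \]

Using the asymptotics (3.39) and (3.40) for the digamma function and its derivative we can rewrite:

\[ I^{\phi}(f_\alpha^{(n)}) = A(\alpha,\gamma)\,n + B(\alpha,\gamma)\sqrt{n} + C(\alpha,\gamma) + O\left(\frac{1}{\sqrt{n}}\right), \tag{3.27} \]

where

\[ A(\alpha,\gamma) = \frac{1}{\alpha(1-\alpha)} + \frac{(\alpha-\gamma)^2}{(1-\alpha)^2\alpha^2}, \]

\[ B(\alpha,\gamma) = \frac{2\alpha\gamma - \gamma - \alpha^2}{(1-\alpha)^2\alpha^2} + \frac{(\alpha-\gamma)^2\left(\alpha(2\gamma-1) - \gamma\right)}{(1-\alpha)^3\alpha^3}, \]

\[ C(\alpha,\gamma) = \]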

, . a — 2 a 4 — 2 f 2 + 6«7 3 + a 3 (2 + 47) — 3 a *7 + 7 2 ) 

= -2(1 - a) 3 a 3 ) h 

, a 4 (—31 - 447 + 72 7 2 - 567 s + 287 4 + 36 a - 12 a 2 ) | 

+ 12(1 - a) 4 a 4 + 

6a 2 (7 2 — 27 s + I27 4 — 1) — 47 3 (ll7 — 44a7 — 6 + 37 2 — 67 s + M7 4 ) 
+ 12(1 -a) 4 a 4 


(3.28) 

(3.29) 


(3.30) 
The weight function of the form (1.12) produces an additional term of order $\sqrt{n}$, but the main order, $n$, remains the same. However, the coefficient in front of it is larger by

\[ \frac{(\alpha-\gamma)^2}{(1-\alpha)^2\alpha^2}. \]

Evidently, when the frequency of special interest is equal to the true frequency, the leading term is the same as in the Fisher information with constant weight. Also note that the rate depends on the distance between $\gamma$ and $\alpha$, and when $\gamma \to \alpha$ only the first term remains. □
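As a sanity check on the leading coefficients, the closed form of the weighted Fisher information can be evaluated directly and compared with $A n + B\sqrt{n}$. This is a hedged sketch: it assumes $x = \alpha n$, $z = \gamma\sqrt{n}$, and the coefficient formulas $A$ and $B$ as written in the code comments; the helper names (`digamma`, `trigamma`, `weighted_fisher`, `coeff_a`, `coeff_b`) are illustrative and not from the paper.

```python
import math


def digamma(x: float) -> float:
    """psi(x) for x > 0: upward recurrence, then the asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    tail = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 * (1 / 252 - inv2 * (1 / 240 - inv2 / 132))))
    return acc + math.log(x) - 0.5 / x - tail


def trigamma(x: float) -> float:
    """psi^(1)(x) for x > 0: upward recurrence, then the asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    tail = inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 * (1 / 42 - inv2 * (1 / 30 - inv2 * 5 / 66))))
    return acc + inv + 0.5 * inv2 + tail


def weighted_fisher(n: float, alpha: float, gamma: float) -> float:
    """Closed form of the weighted Fisher information for weight p^z (1-p)^(sqrt(n)-z)."""
    x = alpha * n
    z = gamma * math.sqrt(n)
    a = x + z + 1.0                       # first Beta parameter of the weighted posterior
    b = n - x + math.sqrt(n) - z + 1.0    # second Beta parameter
    p_diff = digamma(a) - digamma(x + 1.0)
    q_diff = digamma(n - x + 1.0) - digamma(b)
    n2 = n * n
    return n2 * (trigamma(a) + trigamma(b)) + n2 * (p_diff ** 2 + q_diff ** 2) + 2 * n2 * q_diff * p_diff


def coeff_a(alpha: float, gamma: float) -> float:
    # A = 1/(a(1-a)) + (a-g)^2 / ((1-a)^2 a^2)
    return 1 / (alpha * (1 - alpha)) + (alpha - gamma) ** 2 / ((1 - alpha) ** 2 * alpha ** 2)


def coeff_b(alpha: float, gamma: float) -> float:
    # B = (2ag - g - a^2)/((1-a)^2 a^2) + (a-g)^2 (a(2g-1) - g)/((1-a)^3 a^3)
    first = (2 * alpha * gamma - gamma - alpha ** 2) / ((1 - alpha) ** 2 * alpha ** 2)
    second = (alpha - gamma) ** 2 * (alpha * (2 * gamma - 1) - gamma) / ((1 - alpha) ** 3 * alpha ** 3)
    return first + second


if __name__ == "__main__":
    n, alpha, gamma = 1e6, 0.3, 0.6
    exact = weighted_fisher(n, alpha, gamma)
    print(exact / (coeff_a(alpha, gamma) * n))                                   # close to 1
    print((exact - coeff_a(alpha, gamma) * n) / math.sqrt(n), coeff_b(alpha, gamma))  # close to each other
```

The second printed pair differs by roughly $C(\alpha,\gamma)/\sqrt{n}$, so the agreement improves as $n$ grows.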


Weighted inequalities 


Recall that $Z_\theta \in \mathbb{R}^d$ is the family of RVs with PDF $f_\theta$, where $\theta \in \Theta \subset \mathbb{R}^m$ is the vector of parameters of the PDF $f_\theta$. Let $\phi(z,\theta,\gamma)$ be the continuous positive weight function defined in (3.32), $I^{\phi}(\theta)$ be the weighted Fisher information $(m \times m)$ matrix given in (1.16) and $g(\theta)$ be the weighted expectation given in (1.13). Let $V^{\phi}_\theta(Z)$ be the weighted covariance matrix of the RV $Z_\theta$:

\[ V^{\phi}_\theta(Z) = E^{\phi}_\theta\left[(Z - e(\theta))(Z - e(\theta))^T\right]. \tag{3.31} \]

We also assume that in (1.13) and (3.33) differentiation with respect to the parameters, up to the order under consideration, is valid under the sign of integration. So, the equality (4.1) (and analogous ones) holds. A sufficient condition for this is that the integrand after the operation of differentiation, $\eta(\theta)$, is bounded by an integrable function $\chi$ which does not depend on $\theta$:

\[ |\eta(\theta)| < \chi, \]

i.e. the integral converges uniformly in $\theta$.

In the following sections we consider the special class of weight functions which can be represented in the following form:

\[ \phi(z,\theta,\gamma) = \frac{1}{\kappa(\theta,\gamma)}\,\phi(z,\gamma). \tag{3.32} \]

Here $\kappa(\theta,\gamma) \in C^k$, where $C^k$ is the family of functions with continuous derivatives up to order $k$ ($k$ will be specified below), and $\kappa(\theta,\gamma)$ is found from the normalizing condition

\[ \int \phi(z,\theta,\gamma) f_\theta(z)\,dz = 1 \tag{3.33} \]

as before. Note that the condition (3.33) can be rewritten in the following form:

\[ \int \phi(z,\gamma) f_\theta(z)\,dz = \kappa(\theta,\gamma), \tag{3.34} \]

where $\phi(z,\gamma)$ is a function that has a sharp peak at the point $\gamma$ and does not depend on $\theta$.

In the Bayesian framework we consider the RV $Z_\alpha^{(n)}$ with a PDF $f_\alpha^{(n)}$ given in (1.2), assuming that $x = x(n) = \lfloor \alpha n \rfloor$, considered in the Bayesian problem stated above [?]. The explicit asymptotic expansions for the lower bound are obtained for the following weight functions:

\[ \phi_1^{(n)}(p) = \frac{1}{\kappa_1(\alpha,\gamma)}\,p^{\gamma}(1-p)^{1-\gamma}, \tag{3.35} \]

\[ \phi_2^{(n)}(p) = \frac{1}{\kappa_2(\alpha,\gamma)}\,p^{\gamma\sqrt{n}}(1-p)^{(1-\gamma)\sqrt{n}}, \tag{3.36} \]

\[ \phi_3^{(n)}(p) = \frac{1}{\kappa_3(\alpha,\gamma)}\,p^{\gamma n}(1-p)^{(1-\gamma)n}, \tag{3.37} \]

where $\kappa_i(\alpha,\gamma)$, $i = 1, 2, 3$, are found from the condition (3.33).

Denote the partial derivative of order $j$ by

\[ f_\theta^{(j)} = \frac{\partial^j f_\theta}{\partial\theta^j}. \]

Denote by $\psi^{(0)}(x) = \psi(x)$ and by $\psi^{(1)}(x)$ the digamma function and its first derivative respectively:

\[ \psi^{(n)}(x) = \frac{d^{n+1}}{dx^{n+1}}\log\Gamma(x), \tag{3.38} \]

where r(x) is the Gamma-function. In further calculations the asymptotic of these functions 
for x —y oo will be used P #8.362.2] 


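\[ \psi(x) = \log(x) - \frac{1}{2x} + O\left(\frac{1}{x^2}\right) \quad \text{as } x \to \infty, \tag{3.39} \]

\[ \psi^{(1)}(x) = \frac{1}{x} + \frac{1}{2x^2} + O\left(\frac{1}{x^3}\right) \quad \text{as } x \to \infty. \tag{3.40} \]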
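The expansions (3.39) and (3.40) are easy to check numerically. The sketch below is illustrative only: `digamma` and `trigamma` are hypothetical helper names implemented with the recurrences $\psi(x) = \psi(x+1) - 1/x$, $\psi^{(1)}(x) = \psi^{(1)}(x+1) + 1/x^2$ and the standard higher-order asymptotic series, and they are compared against the two leading terms quoted above.

```python
import math


def digamma(x: float) -> float:
    """psi(x) for x > 0: upward recurrence, then the asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    tail = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 * (1 / 252 - inv2 * (1 / 240 - inv2 / 132))))
    return acc + math.log(x) - 0.5 / x - tail


def trigamma(x: float) -> float:
    """psi^(1)(x) for x > 0: upward recurrence, then the asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    tail = inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 * (1 / 42 - inv2 * (1 / 30 - inv2 * 5 / 66))))
    return acc + inv + 0.5 * inv2 + tail


def digamma_leading(x: float) -> float:
    """Leading terms of (3.39): log(x) - 1/(2x)."""
    return math.log(x) - 1.0 / (2.0 * x)


def trigamma_leading(x: float) -> float:
    """Leading terms of (3.40): 1/x + 1/(2x^2)."""
    return 1.0 / x + 1.0 / (2.0 * x * x)


if __name__ == "__main__":
    for x in (10.0, 100.0, 1000.0):
        # Remainders scaled by x^2 and x^3 should stay bounded.
        print(x, (digamma(x) - digamma_leading(x)) * x ** 2,
              (trigamma(x) - trigamma_leading(x)) * x ** 3)
```

The scaled remainders settle near $-1/12$ and $1/6$, the coefficients of the next terms in the respective series.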


4 Weighted Rao-Cramer inequality 


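Theorem 7. (Weighted Rao-Cramer inequality). Assume that

\[ \frac{\partial g(\theta)}{\partial\theta} = \int_{\mathbb{R}^d} z\,\frac{\partial}{\partial\theta}\left[f_\theta(z)\,\phi(z,\theta,\gamma)\right]dz. \tag{4.1} \]

Note that (4.1) holds if the integral in its RHS converges uniformly in $\theta$. Then the following inequality for the weighted covariance matrix $V^{\phi}_\theta(Z)$ holds:

\[ V^{\phi}_\theta(Z) \ge \left(\frac{\partial g(\theta)}{\partial\theta} - \frac{\kappa'_\theta(\theta,\gamma)}{\kappa(\theta,\gamma)}\,(e(\theta) - g(\theta))\right)\left(I^{\phi}(\theta)\right)^{-1}\left(\frac{\partial g(\theta)}{\partial\theta} - \frac{\kappa'_\theta(\theta,\gamma)}{\kappa(\theta,\gamma)}\,(e(\theta) - g(\theta))\right)^T. \tag{4.2} \]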


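Proof. Consider the following integral:

\[ g(\theta) = \int_{\mathbb{R}^d} z\,\phi(z,\theta,\gamma) f_\theta(z)\,dz. \tag{4.3} \]

Differentiating both sides in (4.3) and in (3.34) with respect to $\theta$, and multiplying the latter by $e(\theta)$ defined in (1.14), we obtain

\[ \int z\,\phi(z,\theta,\gamma)\frac{\partial f_\theta}{\partial\theta}\,dz - \frac{\kappa'(\theta,\gamma)}{\kappa^2(\theta,\gamma)}\int z\,\phi(z,\gamma) f_\theta(z)\,dz = \frac{\partial g(\theta)}{\partial\theta}, \tag{4.4} \]

\[ e(\theta)\int \phi(z,\theta,\gamma)\frac{\partial f_\theta}{\partial\theta}\,dz = \frac{\kappa'(\theta,\gamma)}{\kappa(\theta,\gamma)}\,e(\theta). \tag{4.5} \]

Subtracting (4.5) from (4.4),

\[ \int (z - e(\theta))\,\phi(z,\theta,\gamma)\frac{\partial f_\theta}{\partial\theta}\,dz = \frac{\partial g(\theta)}{\partial\theta} - \frac{\kappa'(\theta,\gamma)}{\kappa(\theta,\gamma)}\,(e(\theta) - g(\theta)). \]

Multiplying and dividing by $\sqrt{f_\theta}$, multiplying by the conjugate vector and applying the Cauchy-Schwarz inequality, we get

\[ V^{\phi}_\theta(Z) \ge \left(\frac{\partial g(\theta)}{\partial\theta} - (e(\theta) - g(\theta))\frac{\kappa'(\theta,\gamma)}{\kappa(\theta,\gamma)}\right)\left(I^{\phi}(\theta)\right)^{-1}\left(\frac{\partial g(\theta)}{\partial\theta} - (e(\theta) - g(\theta))\frac{\kappa'(\theta,\gamma)}{\kappa(\theta,\gamma)}\right)^T, \tag{4.6} \]

where $I^{\phi}(\theta)$ is the $(m \times m)$ weighted Fisher information matrix defined in (1.16). □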


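Theorem 8. Let $Z_\alpha^{(n)}$ be a RV with a PDF $f_\alpha^{(n)}$ given in (1.2), assuming that $x = \lfloor \alpha n \rfloor$ where $0 < \alpha < 1$. Then

(a) When the weight function $\phi(p) = \phi_1(p)$ is given in (3.35),

\[ V^{\phi_1}(Z_\alpha^{(n)}) \ge \frac{\alpha(1-\alpha)}{n} + \frac{1 - 14\alpha + 18\alpha^2 + 2\gamma - 8\alpha\gamma + 2\gamma^2}{2n^2} + O\left(\frac{1}{n^{5/2}}\right). \tag{4.7} \]

(b) When the weight function $\phi(p) = \phi_2(p)$ is given in (3.36),

\[ V^{\phi_2}(Z_\alpha^{(n)}) \ge \frac{\alpha(1-\alpha) + (\alpha-\gamma)^2}{n} + \frac{-2\alpha + \alpha^2 + \gamma + 2\alpha\gamma - 2\gamma^2}{n^{3/2}} + O\left(\frac{1}{n^{2}}\right). \tag{4.8} \]

(c) When the weight function $\phi(p) = \phi_3(p)$ is given in (3.37),

\[ V^{\phi_3}(Z_\alpha^{(n)}) \ge \frac{(\alpha-\gamma)^2}{4} + C_3(\alpha,\gamma)\frac{1}{n} + O\left(\frac{1}{n^{3/2}}\right), \tag{4.9} \]

where $C_3$ is a constant which depends only on $\alpha$ and $\gamma$ and is given explicitly in (4.30).

Proof. (a) Consider the weight function

\[ \phi_1^{(n)}(p) = \frac{1}{\kappa_1(\alpha,\gamma)}\,p^{\gamma}(1-p)^{1-\gamma}, \tag{4.10} \]

where $\kappa_1(\alpha,\gamma)$ is found from the normalizing condition (3.33). Thus,

\[ \frac{1}{\kappa_1(\alpha,\gamma)} = \frac{\Gamma(x+1)\Gamma(n-x+1)\Gamma(n+3)}{\Gamma(x+\gamma+1)\Gamma(n-x+2-\gamma)\Gamma(n+2)}. \]

Note that the normalizing constant depends on $n$, but the remainder does not contain $n$ and $\alpha$. For the weight function (4.10) the Fisher information equals:

\[ \begin{aligned} I^{\phi_1}(f_\alpha^{(n)}) = {} & n^2\left(\psi^{(1)}(x+\gamma+1) + \psi^{(1)}(n-x+1+1-\gamma)\right) \\ & + n^2\left[(\psi(x+\gamma+1) - \psi(x+1))^2 + (\psi(n-x+1+1-\gamma) - \psi(n-x+1))^2\right] \\ & + 2n^2\left[(\psi(n-x+1) - \psi(n-x+1-\gamma+1))(\psi(x+\gamma+1) - \psi(x+1))\right]. \end{aligned} \tag{4.11} \]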
2 n 2 [(^(n — x + 1) — ^(n — x + 1 — 7 + 1)) (^>(x + 7 + 1) — ^(x + 1 ))]. 
For the weight function (4.10), the integral in (4.3) can be found explicitly:

\[ g_1(\alpha) = \int_0^1 p\,\phi_1^{(n)} f_\alpha^{(n)}\,dp = \frac{\Gamma(n+3)}{\Gamma(x+\gamma+1)\Gamma(n-x-\gamma+2)}\int_0^1 p^{x+\gamma+1}(1-p)^{n-x+1-\gamma}\,dp = \frac{\Gamma(n+3)\Gamma(x+\gamma+2)}{\Gamma(n+4)\Gamma(x+\gamma+1)}. \tag{4.12} \]

Then

\[ \frac{dg_1(\alpha)}{d\alpha} = n\,\frac{\Gamma(n+3)\Gamma(x+\gamma+2)}{\Gamma(n+4)\Gamma(x+\gamma+1)}\left(\psi(x+\gamma+2) - \psi(x+\gamma+1)\right). \tag{4.13} \]

Differentiating $\kappa_1(\alpha,\gamma)$ we obtain:

\[ \frac{\kappa_1'(\alpha,\gamma)}{\kappa_1(\alpha,\gamma)} = n\left(\psi(n-x+1) - \psi(n-x+1-\gamma+1) + \psi(x+\gamma+1) - \psi(x+1)\right). \tag{4.14} \]

Also

\[ e(\alpha) = \frac{\Gamma(n+2)}{\Gamma(x+1)\Gamma(n-x+1)}\int_0^1 p^{x+1}(1-p)^{n-x}\,dp = \frac{\Gamma(n+2)\Gamma(x+2)}{\Gamma(n+3)\Gamma(x+1)}. \tag{4.15} \]

Plugging (4.11), (4.12), (4.13), (4.14) and (4.15) into (4.2) we get

\[ V^{\phi_1}(Z_\alpha^{(n)}) \ge \frac{\alpha(1-\alpha)}{n} + \frac{1 - 14\alpha + 18\alpha^2 + 2\gamma - 8\alpha\gamma + 2\gamma^2}{2n^2} + O\left(\frac{1}{n^{5/2}}\right). \tag{4.16} \]

(b) Consider the weight function

\[ \phi_2^{(n)}(p) = \frac{1}{\kappa_2(\alpha,\gamma)}\,p^{\gamma\sqrt{n}}(1-p)^{(1-\gamma)\sqrt{n}}, \tag{4.17} \]

where $\kappa_2(\alpha,\gamma)$ is found from the normalizing condition (3.33):

\[ \frac{1}{\kappa_2(\alpha,\gamma)} = \frac{\Gamma(x+1)\Gamma(n-x+1)\Gamma(n+2+\sqrt{n})}{\Gamma(x+\gamma\sqrt{n}+1)\Gamma(n-x+1+\sqrt{n}-\gamma\sqrt{n})\Gamma(n+2)}. \]

Note that the normalizing constant depends on $n$ as well as the remainder. For the weight function (5.4) the Fisher information equals:

\[ \begin{aligned} I^{\phi_2}(f_\alpha^{(n)}) = {} & n^2\left(\psi^{(1)}(x+z+1) + \psi^{(1)}(n-x+1+\sqrt{n}-z)\right) \\ & + n^2\left[(\psi(x+z+1) - \psi(x+1))^2 + (\psi(n-x+1+\sqrt{n}-z) - \psi(n-x+1))^2\right] \\ & + 2n^2\left[(\psi(n-x+1) - \psi(n-x+\sqrt{n}-z+1))(\psi(x+z+1) - \psi(x+1))\right], \end{aligned} \tag{4.18} \]

where $z = \gamma\sqrt{n}$.

For the weight function (5.4), the integral in (4.3) equals

\[ g_2(\alpha) = \int_0^1 p\,\phi_2^{(n)} f_\alpha^{(n)}\,dp = \frac{\Gamma(n+\sqrt{n}+2)\Gamma(x+\gamma\sqrt{n}+2)}{\Gamma(n+\sqrt{n}+3)\Gamma(x+\gamma\sqrt{n}+1)}. \tag{4.19} \]

Then

\[ \frac{dg_2(\alpha)}{d\alpha} = n\,\frac{\Gamma(n+\sqrt{n}+2)\Gamma(x+\gamma\sqrt{n}+2)}{\Gamma(n+\sqrt{n}+3)\Gamma(x+\gamma\sqrt{n}+1)}\left(\psi(x+\gamma\sqrt{n}+2) - \psi(x+\gamma\sqrt{n}+1)\right). \tag{4.20} \]

Differentiating $\kappa_2(\alpha,\gamma)$ we obtain

\[ \frac{\kappa_2'(\alpha,\gamma)}{\kappa_2(\alpha,\gamma)} = n\left(\psi(n-x+1) - \psi(n-x+1+\sqrt{n}-z) + \psi(x+z+1) - \psi(x+1)\right). \tag{4.21} \]

Plugging (4.18), (4.19), (4.20), (4.21) and (4.15) into (4.2) we get

\[ V^{\phi_2}(Z_\alpha^{(n)}) \ge \frac{\alpha(1-\alpha) + (\alpha-\gamma)^2}{n} + \frac{-2\alpha + \alpha^2 + \gamma + 2\alpha\gamma - 2\gamma^2}{n^{3/2}} + O\left(\frac{1}{n^{2}}\right). \tag{4.22} \]
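The lower bound for the weight $\phi_2$ can also be evaluated numerically from its ingredients and compared with the two leading terms. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $x = \alpha n$ (no integer part), uses the simplifications $g_2 = (x+z+1)/(n+\sqrt{n}+2)$, $e = (x+1)/(n+2)$ and $\psi(a+1) - \psi(a) = 1/a$, and the helper names are hypothetical.

```python
import math


def digamma(x: float) -> float:
    """psi(x) for x > 0: upward recurrence, then the asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    tail = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 * (1 / 252 - inv2 * (1 / 240 - inv2 / 132))))
    return acc + math.log(x) - 0.5 / x - tail


def trigamma(x: float) -> float:
    """psi^(1)(x) for x > 0: upward recurrence, then the asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    tail = inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 * (1 / 42 - inv2 * (1 / 30 - inv2 * 5 / 66))))
    return acc + inv + 0.5 * inv2 + tail


def rao_cramer_bound_phi2(n: float, alpha: float, gamma: float) -> float:
    """Right-hand side of the weighted Rao-Cramer inequality for the weight phi_2."""
    root = math.sqrt(n)
    x, z = alpha * n, gamma * root
    a = x + z + 1.0
    b = n - x + root - z + 1.0
    p_diff = digamma(a) - digamma(x + 1.0)
    q_diff = digamma(n - x + 1.0) - digamma(b)
    fisher = n * n * (trigamma(a) + trigamma(b)) + n * n * (p_diff + q_diff) ** 2
    g = a / (n + root + 2.0)              # weighted mean: the Gamma ratio simplifies
    dg = n * g / a                        # n * g * (psi(a + 1) - psi(a)) with psi(a + 1) - psi(a) = 1/a
    dlog_kappa = n * (p_diff + q_diff)    # kappa'/kappa
    e = (x + 1.0) / (n + 2.0)             # unweighted posterior mean
    numerator = dg - dlog_kappa * (e - g)
    return numerator * numerator / fisher


if __name__ == "__main__":
    n, alpha, gamma = 1e8, 0.3, 0.6
    lead = alpha * (1 - alpha) + (alpha - gamma) ** 2
    second = -2 * alpha + alpha ** 2 + gamma + 2 * alpha * gamma - 2 * gamma ** 2
    bound = rao_cramer_bound_phi2(n, alpha, gamma)
    print(n * bound, lead)                            # leading coefficient
    print((n * bound - lead) * math.sqrt(n), second)  # coefficient of n^(-3/2)
```

Both printed pairs should agree up to terms that vanish as $n$ grows, which supports the form of the two leading coefficients in the bound.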


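(c) Consider the weight function

\[ \phi_3^{(n)}(p) = \frac{1}{\kappa_3(\alpha,\gamma)}\,p^{\gamma n}(1-p)^{(1-\gamma)n}, \tag{4.23} \]

where $\kappa_3(\alpha,\gamma)$ is found from the normalizing condition (3.33):

\[ \frac{1}{\kappa_3(\alpha,\gamma)} = \frac{\Gamma(x+1)\Gamma(n-x+1)\Gamma(2n+2)}{\Gamma(x+\gamma n+1)\Gamma(2n-x+1-\gamma n)\Gamma(n+2)}. \]

Note that the normalizing constant depends on $n$ as well as the remainder. Let $y = \gamma n$; then the Fisher information in this case equals:

\[ \begin{aligned} I^{\phi_3}(f_\alpha^{(n)}) = {} & n^2\left(\psi^{(1)}(x+y+1) + \psi^{(1)}(2n-x+1-y)\right) \\ & + n^2\left[(\psi(x+y+1) - \psi(x+1))^2 + (\psi(2n-x+1-y) - \psi(n-x+1))^2\right] \\ & + 2n^2\left[(\psi(n-x+1) - \psi(2n-x-y+1))(\psi(x+y+1) - \psi(x+1))\right]. \end{aligned} \tag{4.24} \]

Note that, unlike the two cases above, the differences in brackets do not tend to zero, i.e.,

\[ \psi(x+y+1) - \psi(x+1) = \log\left(\frac{\alpha+\gamma}{\alpha}\right) - \frac{\gamma}{2\alpha(\alpha+\gamma)n} + O\left(\frac{1}{n^2}\right). \]

Using (3.39) and (3.40), we obtain

\[ I^{\phi_3}(f_\alpha^{(n)}) = \left(\log\frac{\alpha(2-\alpha-\gamma)}{(1-\alpha)(\alpha+\gamma)}\right)^2 n^2 + C_1(\alpha,\gamma)\,n + C_2(\alpha,\gamma) + O\left(\frac{1}{n}\right), \tag{4.25} \]

where $C_1(\alpha,\gamma)$ and $C_2(\alpha,\gamma)$ are constants that depend on $\alpha$ and $\gamma$ and can be found explicitly:

\[ C_1 = \frac{1}{\alpha+\gamma} + \frac{1}{2-\alpha-\gamma} + \log\left(\frac{\alpha(2-\alpha-\gamma)}{(1-\alpha)(\alpha+\gamma)}\right)\left(\frac{\gamma}{\alpha(\alpha+\gamma)} - \frac{1-\gamma}{(1-\alpha)(2-\alpha-\gamma)}\right), \tag{4.26} \]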

C,= 


7 


log 


a(2 — a — 7) 


6(—1 + a) 2 6(—2 + a + 7) 2 3a(a + 7) 2 6a 2 (a + 7) 2 / (1 — a)(a + 7) 

—4a 6 — 8a 5 (—2 + 7) + (—2 + 7) 2 7 2 — 4a(—2 + 7) 2 7 2 + a 4 (—27 + 2O7) + 4a 3 (6 — 47 — 37 s + 27 s ) 

4(—1 + a) 2 a 2 (—2 + a + 7) 2 (a + 7) 2 
a 2 (-8 + 47 + 22 7 2 - 2O7 3 + 4 7 4 ) 


4(—1 + a) 2 a 2 (—2 + a + 7) 2 (a + 7) 2 
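(4.27)

Also note that

\[ g_3(\alpha) = \int_0^1 p\,\phi_3^{(n)} f_\alpha^{(n)}\,dp = \frac{\Gamma(2n+2)\Gamma(x+y+2)}{\Gamma(2n+3)\Gamma(x+y+1)} = \frac{\alpha+\gamma}{2} + \frac{1-\alpha-\gamma}{2n} + O\left(\frac{1}{n^2}\right). \tag{4.28} \]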
It is easy to see that in this case $g(\alpha)$ has a different asymptotic compared with the two cases above, so $g(\alpha) - E(Z_\alpha^{(n)})$ does not tend to zero as before. Proceeding with the same computations as before we obtain

\[ V^{\phi_3}(Z_\alpha^{(n)}) \ge \frac{(\alpha-\gamma)^2}{4} + C_3(\alpha,\gamma)\frac{1}{n} + O\left(\frac{1}{n^{3/2}}\right), \tag{4.29} \]

where $C_3(\alpha,\gamma)$ is a constant depending on $\alpha$ and $\gamma$ that can be found explicitly:

c „ = _ ( a ~^) _ x 

48(—1 + a) a (—2 + a + 7) (a + 7)(log(l - a) - log (a) - log(2 - a - 7) + log (a + 7)) 

(-48a 2 + 84a 3 - 24a 4 - 72a7 + 132a 2 7 - 72a 3 7 + 24 7 2 - 12a7 2 - 24a 2 7 2 - 12 7 3 + 24a7 3 + 

((-1 + a)(39a 4 - 2(-2 + - a 3 (50 + 97) + a 2 (-56 + M67 - 1357 2 ) + a7(-44 + 194 7 - 877 2 ))) 

(log(l - a) - log(a))^ 1 

Q, — 1 — ry 

log-(56a 2 — 6a 3 — 89a 4 + 39a 5 + 44a7 — 190a 2 7 + 155a 3 7 — 9a 4 7 — 47 2 — 190a7 2 + 

2 — a — 7 

329a 2 7 2 — 135a 3 7 2 + 27 s + 85a7 3 — 87a 2 7 3 )) 

(4.30) 

□ 


5 Weighted Bhattacharyya inequality 

Theorem 9. (Weighted Bhattacharyya inequality, uniparametric case).

(a) Let $\theta$ be a scalar parameter and $\tau(\theta)$ be a preassigned scalar function of the parameter $\theta$. An unbiased estimator of $\tau(\theta)$ is a scalar function $T(Z)$ such that

\[ e(\theta) = E_\theta[T(Z)] = \tau(\theta). \tag{5.1} \]

Consider a weight function that satisfies the condition (3.33). Recall

\[ g(\theta) = \int_{\mathbb{R}^d} T(z)\,\phi(z,\theta,\gamma) f_\theta(z)\,dz. \tag{5.2} \]

Assume that the integrals in (5.2) and (3.33) converge uniformly in $\theta$ after differentiation up to order $\nu$. Then the following inequality for the weighted variance of $T$ holds:

\[ V^{\phi}_\theta(T) \ge \sum_{i,j=1}^{\nu}\left(g^{(i)}(\theta) - Q_1^i + \tau Q_2^i\right)\left(g^{(j)}(\theta) - Q_1^j + \tau Q_2^j\right)J^{\phi}_{ij}, \tag{5.3} \]

where $Q_1^i$ and $Q_2^i$ are given in (5.13) and (5.15) respectively and $J^{\phi}_{ij}$ are the elements of the matrix $J^{\phi}$ defined in (5.11).

(b) Consider the RV $Z_\alpha^{(n)}$ with PDF $f_\alpha^{(n)}$ given in (1.2) with $x = \lfloor \alpha n \rfloor$ where $0 < \alpha < 1$. When $\nu = 2$, $\theta = \alpha$, $T(Z) = Z_\alpha^{(n)}$, for the weight function

\[ \phi_2^{(n)}(p) = \frac{1}{\kappa_2(\alpha,\gamma)}\,p^{\gamma\sqrt{n}}(1-p)^{(1-\gamma)\sqrt{n}} \tag{5.4} \]

inequality (5.3) takes the following form:

\[ V^{\phi_2}_\alpha(Z_\alpha^{(n)}) \ge \frac{C_4}{n} + \frac{C_5}{n^{3/2}} + O\left(\frac{1}{n^2}\right), \tag{5.5} \]

where $C_4$, $C_5$ are constants that depend on $\alpha$ and $\gamma$; $C_4$ is given explicitly in (5.21).
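Proof. (a) Consider the function $R_\nu(Z;\theta)$:

\[ R_\nu(Z;\theta) = T(Z) - \tau(\theta) - \sum_{i=1}^{\nu}\lambda_i\,\frac{f_\theta^{(i)}}{f_\theta}, \tag{5.6} \]

where $\lambda_i$ are undefined parameters. It is easy to note that

\[ E[R_\nu(Z;\theta)] = 0. \tag{5.7} \]

Consider the weighted variance of $R_\nu$, defined as in (3.31). Because of (5.7) it can be written in the following form:

\[ V^{\phi}_\theta(R_\nu) = \int_{\mathbb{R}^d}\left(T(z) - \tau(\theta) - \sum_{i=1}^{\nu}\lambda_i\,\frac{f_\theta^{(i)}}{f_\theta}\right)^2\phi(z,\theta,\gamma) f_\theta\,dz. \tag{5.8} \]

By the conditions of the theorem the differentiation is justified, and minimizing (5.8) with respect to $\lambda_j$, $j = 1, \ldots, \nu$, leads to the following condition:

\[ \int_{\mathbb{R}^d}\left(T(z) - \tau(\theta) - \sum_{i=1}^{\nu}\lambda_i\,\frac{f_\theta^{(i)}}{f_\theta}\right) f_\theta^{(j)}\,\phi\,dz = 0. \tag{5.9} \]

It can be rewritten as

\[ \sum_{i=1}^{\nu}\lambda_i\int f_\theta^{(i)}\frac{1}{f_\theta} f_\theta^{(j)}\,\phi\,dz = \int T(z)\,\phi f_\theta^{(j)}\,dz - \tau(\theta)\int \phi f_\theta^{(j)}\,dz. \tag{5.10} \]

Let $I^{\phi}$ be the $\nu \times \nu$ matrix whose elements are

\[ I^{\phi}_{ij} = \int f_\theta^{(i)}\frac{1}{f_\theta} f_\theta^{(j)}\,\phi\,dz, \]

$i, j \le \nu$. Let

\[ J^{\phi} = \left(I^{\phi}\right)^{-1} \tag{5.11} \]

be the inverse $\nu \times \nu$ matrix, with elements $J^{\phi}_{ij}$. Note that in the case $i = j = 1$, $I^{\phi}_{11}$ equals the weighted Fisher information given in (1.16).

Consider the integrals in the RHS of (5.10) separately. Firstly, differentiating (5.2) $j$ times and using (3.32),

\[ \int T(z)\,\phi\left[\sum_{k=0}^{j-1}\binom{j}{k}\left(\frac{1}{\kappa(\theta,\gamma)}\right)^{(j-k)}\kappa(\theta,\gamma)\,f_\theta^{(k)}\right]dz + \int T(z)\,\phi f_\theta^{(j)}\,dz = g^{(j)}(\theta). \tag{5.12} \]

Thus,

\[ \int T(z)\,\phi f_\theta^{(j)}\,dz = g^{(j)}(\theta) - Q_1^j, \]

where

\[ Q_1^j = \int T(z)\,\phi\left[\sum_{k=0}^{j-1}\binom{j}{k}\left(\frac{1}{\kappa(\theta,\gamma)}\right)^{(j-k)}\kappa(\theta,\gamma)\,f_\theta^{(k)}\right]dz. \tag{5.13} \]

In the analogous way, from the condition (3.33) the following equality can be derived:

\[ \int_{\mathbb{R}^d}\phi f_\theta^{(j)}\,dz = -Q_2^j, \tag{5.14} \]

where

\[ Q_2^j = \int \phi\left[\sum_{k=0}^{j-1}\binom{j}{k}\left(\frac{1}{\kappa(\theta,\gamma)}\right)^{(j-k)}\kappa(\theta,\gamma)\,f_\theta^{(k)}\right]dz. \tag{5.15} \]

So, (5.10) takes the form

\[ g^{(j)}(\theta) = \sum_{i=1}^{\nu}\lambda_i I^{\phi}_{ij} + Q_1^j - \tau Q_2^j \tag{5.16} \]

and

\[ \lambda_i = \sum_{j=1}^{\nu}\left(g^{(j)}(\theta) - Q_1^j + \tau Q_2^j\right)J^{\phi}_{ji}. \tag{5.17} \]

Thus, we obtain the following equality:

\[ V^{\phi}_\theta(R_\nu) = V^{\phi}_\theta(T) - \sum_{i,j=1}^{\nu}\left(g^{(i)}(\theta) - Q_1^i + \tau Q_2^i\right)\left(g^{(j)}(\theta) - Q_1^j + \tau Q_2^j\right)J^{\phi}_{ij}. \tag{5.18} \]

The non-negativity of the variance implies the lower bound for the weighted variance of $T$ given in (5.3).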

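Remark 3. Note that this inequality includes the weighted version of the Rao-Cramer inequality. It appears when $\tau(\theta) = e(\theta)$, $\theta = \alpha$, $g(\theta) = g(\alpha)$, $T(Z) = Z$ and $i = j = \nu = 1$. In this particular case

\[ I^{\phi}_{11} = I^{\phi}(\theta) = \int_{\mathbb{R}^d}\phi\,\frac{(f_\theta^{(1)})^2}{f_\theta}\,dz, \]

and

\[ \int_{\mathbb{R}^d}\phi f_\theta^{(1)}\,dz = \frac{\kappa'(\theta,\gamma)}{\kappa(\theta,\gamma)}, \qquad \int_{\mathbb{R}^d}T(z)\,\phi f_\theta^{(1)}\,dz = g'(\theta) + \frac{\kappa'(\theta,\gamma)}{\kappa(\theta,\gamma)}\,g(\theta). \]

Thus, we obtain the weighted Rao-Cramer inequality of Theorem 7.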


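(b) The lower bound in (5.3) takes the following form:

\[ \begin{aligned} & \left(g^{(1)}(\theta) - Q_1^1 + \tau Q_2^1\right)\left(J^{\phi}_{12} + J^{\phi}_{21}\right)\left(g^{(2)}(\theta) - Q_1^2 + \tau Q_2^2\right) \\ & \quad + \left(g^{(2)}(\theta) - Q_1^2 + \tau Q_2^2\right)^2 J^{\phi}_{22} + \left(g^{(1)}(\theta) - Q_1^1 + \tau Q_2^1\right)^2 J^{\phi}_{11}, \end{aligned} \tag{5.19} \]

where $J^{\phi}_{ij}$ are the elements of the matrix defined in (5.11). Moreover, the asymptotic of $I^{\phi}_{11}$ is given above. Compute the asymptotic of the other terms:

\[ I^{\phi}_{12} = \int f^{(1)}\frac{1}{f} f^{(2)}\,\phi\,dp = L_1 n^{3/2} + L_2 n + L_3\sqrt{n} + L_4 + O\left(\frac{1}{\sqrt{n}}\right), \]

where $L_i$, $i = 1, 2, 3, 4$, are constants that can be found explicitly and depend only on $\alpha$ and $\gamma$, but have a very cumbersome form: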


_ a 3 + a 2 (-2 + 7) + a(2 - 37)7 + 7 3 
1 ~ (1 - a) 3 a 3 ' 

r -2a 5 - 37 4 + a 4 (—3 + 16c) + 2a 3 (5 - 177 + 7 2 ) , 

2 “ 2(1 — a) 4 a 4 + 

2a7 2 (—4 + 37 + 37 s ) + a 2 (—2 + 67 + 247 2 — I87 3 ) 

2(1 — a) 4 a 4 

_ -2I7 5 + 24a 6 (—1 + 27) + a 5 (13 + 24 7 - I687 2 ) - 2a 2 7 3 (-109 + 72 7 + 367 2 ) 

3 12(—1 + a) 5 a 5 

a 7 3 (—44 + 337 + 72 7 2 ) + a 4 (44 - 2377 + 492 7 2 - 487 s ) + 6a 3 (-2 + IO7 - I97 2 - 567 s + 367 4 ) 

12(—1 + a) 5 a 5 

16a 9 - 157 6 + a 8 (407 - 92) - 4a 6 (-41 + 14 7 + 267 s ) + 2a7 4 (-12 + IO7 + 357 2 ) 

4 “ 8(1 — a) 6 a 6 + 

a 6 (—161 + II87 + 1367 2 + 1527 s ) + a 2 7 2 (24 - 127 + 1577 2 - I2O7 4 ) 

8(1 — a) 6 a 6 

2a 5 (66 - 1587 + 977 2 - 3O87 3 + 4O7 4 + a 4 (-52 + 1487 + 757 2 + I6O7 3 + 4OO7 4 - 2407 s ) 

8(1 — a) 6 a 6 

4a 3 (2 - 67 - 257 2 + 4 7 3 - 977 4 + 6O7 4 + 2O7 6 ) 

8(1 — a) 6 a 6 

The asymptotic of $I^{\phi}_{22}$ takes the form

\[ I^{\phi}_{22} = \int (f^{(2)})^2\frac{1}{f}\,\phi\,dp = L_5 n^2 + L_6 n^{3/2} + L_7 n + L_8\sqrt{n} + L_9 + O\left(\frac{1}{\sqrt{n}}\right), \]

where $L_i$ are again constants that can be found explicitly and depend on $\alpha$ and $\gamma$. In order to compute $I^{\phi}_{22}$, one needs to compute integrals of the following form:

\[ \int_0^1 \log(1-p)^i\,\log(p)^j\,p^{A_1(\alpha,\gamma)n + A_2(\alpha,\gamma)\sqrt{n}}\,(1-p)^{A_3(\alpha,\gamma)n + A_4(\alpha,\gamma)\sqrt{n}}\,dp \]

for $i = 1, 2, 3, 4$ and $j = 1, 2, 3, 4$, which were computed above for the cases $i = 1, 2$ and $j = 1, 2$; one can compute the integral for larger $i$ and $j$ by integration by parts. The only problem with deriving the exact coefficients is the computational cost, so we proceed in terms of the constants $L_i$.

In order to use the same notation we will write $I^{\phi}_{11}$ in the following form:

\[ I^{\phi}_{11} = \int (f^{(1)})^2\frac{1}{f}\,\phi\,dp = L_{10}n + L_{11}\sqrt{n} + L_{12} + O\left(\frac{1}{\sqrt{n}}\right), \]

where the coefficients $L_{10}, L_{11}, L_{12}$ are found above.

Other terms in (5.19) can be computed explicitly. Using the notation of the previous section we write


Q 


1 _ 
1 — 


*'(<*, if) 

«(a, 7) 
$g(\alpha)$,














Q\ = 


«(«, 7) 


» r l 


pfifdp - 2 ( g W _ Ql) 
«(«, 7) 


^2 = 


k(o, 7 ) ’ 


and 


^2 = 




" f iSip -2^4-Q\ 

Jo «(a,7) 


where $g^{(1)}$ is given in (4.20). Thus, we obtain the following asymptotic of the lower bound for the weighted variance in the stated Bayesian problem:

\[ V^{\phi}_\alpha(T) \ge \frac{C_4}{n} + \frac{C_5}{n^{3/2}} + O\left(\frac{1}{n^2}\right), \tag{5.20} \]

where $C_4$ and $C_5$ are constants that depend on $\alpha$ and $\gamma$ and can be found explicitly, but they are also too cumbersome. As an example, $C_4$ is given below:

Ch = 


2((a — y ) 2 + a( 1 — a))(—2a 2 + a 3 + 2a 7 + a 2 y — 3ay 2 + y 3 )Li 


(1 - a) 3 a 3 (L 2 - L 10 L 5 ) 


+ 


(—2a 2 + a 3 + 2ay + a 2 y — 3ay 2 + y 3 ) 2 Li 0 (1 -a)a) ^ 5 


(5.21) 


(1 — a) 4 a 4 (—L 2 + L 10 T 5 ) —L 2 + L W L 5 

Remark 4. Note that in the case $\alpha = \gamma$ the first and second terms in $C_4$ vanish. Also one can easily check that $L_1 = 0$ in this case. So, because $L_{10} = \frac{1}{\alpha(1-\alpha)}$, we have

\[ C_4 = \frac{1}{L_{10}} = \alpha(1-\alpha). \]

Thus, the main term of the asymptotic is exactly the same as was obtained above in the standard Cramer-Rao case.


□ 


Theorem 10. (Weighted Bhattacharyya inequality, multiparametric case). Let $\theta \in \Theta \subset \mathbb{R}^m$ be a vector of parameters, $\tau(\theta) = (\tau_1(\theta), \ldots, \tau_l(\theta))^T \in \mathbb{R}^l$ be the preassigned vector function of the parameter $\theta$ and $T(Z)$ be an unbiased estimate of $\tau(\theta)$:

\[ e(\theta) = E_\theta(T) = \int_{\mathbb{R}^d} T(z) f_\theta(z)\,dz = \tau(\theta). \]

Consider the weight function $\phi(z,\theta,\gamma)$ such that the condition (3.33) holds. Assume that the following positive definite matrix exists:

\[ I^{\phi} = E^{\phi}_\theta\left[\beta\beta^T\right], \tag{5.22} \]

where

\[ \beta = (\beta^{(1)}, \ldots, \beta^{(r)})^T \]

is an $r$-dimensional RV, the components of which are all possible expressions of the following form:

\[ \frac{1}{f_\theta(Z)}\,\frac{\partial^{\,i_1+\ldots+i_m}}{\partial\theta_1^{i_1}\ldots\partial\theta_m^{i_m}} f_\theta(Z), \tag{5.23} \]

where $1 \le i_1 + \ldots + i_m \le s$ and $r$ is the total number of all these expressions.

Let $F^{\phi}$ be the $(r \times l)$ matrix whose rows have the following form:

\[ \int_{\mathbb{R}^d}(T(z) - \tau(\theta))^T\,\phi(z,\theta,\gamma)\,\frac{\partial^{\,i_1+\ldots+i_m}}{\partial\theta_1^{i_1}\ldots\partial\theta_m^{i_m}} f_\theta(z)\,dz, \tag{5.24} \]

numbered in the same order as the expressions (5.23). Assume that the integrals in (5.24) and (3.33) converge uniformly in $\theta$ after the operation of differentiation. Then the following inequality for the weighted variance of $T$ holds:

\[ V^{\phi}_\theta(T) \ge (F^{\phi})^T\,I^{\phi}(\theta)^{-1}F^{\phi}. \tag{5.25} \]

Remark 5. Here and below, for $(d \times d)$ matrices $A$ and $B$ of the same dimension $d$, the inequality

\[ A \ge B \]

means that

\[ C = A - B \]

is a non-negative definite matrix.

Proof. Note that the elements of the matrix $F^{\phi}$ can be found from the condition (3.33). Consider the one-dimensional RV

\[ \delta = \left[(T - \tau) - (F^{\phi})^T(I^{\phi})^{-1}\beta\right]^T y, \]

where $y^T = (y_1, \ldots, y_l) \in \mathbb{R}^l$ is a non-random vector. It is easy to see that $E_\theta(\delta) = 0$. Taking the weighted expectation of both sides in the equality

\[ \delta^2 = y^T\left[(T - \tau)(T - \tau)^T - 2(T - \tau)\beta^T(I^{\phi})^{-1}F^{\phi} + (F^{\phi})^T(I^{\phi})^{-1}\beta\beta^T(I^{\phi})^{-1}F^{\phi}\right]y, \tag{5.26} \]

for any $y$ we obtain

\[ E^{\phi}_\theta(\delta^2) = y^T\left[V^{\phi}_\theta(T) - (F^{\phi})^T(I^{\phi})^{-1}F^{\phi}\right]y. \tag{5.27} \]

The non-negativity of the variance implies the multi-parametric version of the Bhattacharyya inequality given in (5.25). One can easily see that in the uni-parametric and 1D case this inequality is equivalent to the weighted Cramer-Rao inequality. □


6 Weighted Kullback inequality 

Theorem 11. (Weighted Kullback inequality) 

(a) For given PDFs f ,g 

K*(f\\g ) > = sup [{t, AA,(/)> + ^gC{g) - log Mgit)] ( 6 . 1 ) 

where 

M g (t) = [ f(z)e {t ’ z) g(z)dz (6.2) 

is a weighted moment generating function, t e M. d and 

“ E f [cf>(Z)} 
is the classical expectation of f.

(b) Let Za ^ and Z p n) be RVs with PDF fi n> given in HI. 2 1 ) with x = \oin\ and with PDF 
fp given in \ 1 . 2 \) with x = [pn\ respectively where 0 < a, p < 1 and weight function 




«(P, 7) 

where n(p, 7) is found from normalization condition 


( 6 . 3 ) 


dp = 1. 


(6.4) 


Denote e = a — p then 


K*y}p\\f?>) > 2 L±ZL ") +0( i) 

^ 2(1 — ajcm 


j4s 6 —^ 0, 


n 


n 


31 im 4 ^(/^ll 4 n) ) = ^(/«)> 9 n , n , 

p 2 2a (1 — a) a(l — a) 


+ 0 ( 1 ) 


where J(/ Q ) is the standard Fisher information. 


( 6 . 6 ) 


Proof. (a) The inequality (6.1) is proved in [14].

(b) Firstly, note that by (6.4):

log(C(/W)) =0. 

The weighted generating function of RV Zjf 1 ' 1 with PDF f p p equals: 

M f (n){t) = j (j)^e tp f^dp — iFi(pn + 'jy/n + l,n + y/n + 2 \t) 

= 1 | y ^ iT ^ + iyw + nj 
“ fc! H n + y/n + 2 + j 

where iFi(x,y,z) is the confluent hypergeometric function. 

For large n, the expression for weighted generating function can be written in the following 
way m formula 12]: 


m,(.,(*) = 1 + y TT pn+ 1 f i+1+J 

fp ^ k\ n + -v/n + 2 + j 

k =1 j =0 v J 


k k _ x 1 2 /c(p - 2 p 2 - y 2 + pk - 2p^k + 7 2 k) 

T.tAp ~ k{ j> -p 7)75 + -—-—-—^- ——— + 0 

k =0 h + 


2 n 


n 3 / 2 


^4( I _ (p _ 7 ) 4 + 2 (1 -P-^ +2 ( r -^ + ^ + 0 (J_ 


Thus, we have that 


log M f ( n) (t) = pt + log ( 1 - (p - + 


1 2(1 — p — j)t + (p— 2p7 + 7 2 )t 2 


2 u 


+ 0 


n 3/2 
+ t u 1 , ( l ~P~l) t + V( 1 _ p)

= pt-(p- ^)t—= +--- 

-v/n n 


O 


77 . 3/2 


The first term in (16. ip for PDF f „ and weight function qf>( n l takes the following form 

1 




?7. 3 / 2 


Then 


(a - p)t - (a - p)—■= + 

n n 


y*f p Mfa)) = SUP 

Finding supremum of the expression above, we obtain 

(a — p) (n — 1 — y/n) 


t p — a (1 — p)p 


2 n 


t 2 + O 


3/2 


77/ 


r = 


(1 — a)a 


+ 0 


n l/2 


So 




(a - p) 2 (1 + y/n - nY 


2(1 — a)a77. 


+ 0 ( 1 ) 


Denote e = a — p. When e —y 0 we obtain 


1 




Thus 


n 


n 


3 lim |/<»>) = i/(/J> 2a(1 _ a) 


+ 0 ( 1 ) 


which completes the proof of Theorem 5. 


( 6 . 6 ) 


(67) 


( 6 . 8 ) 